Self-improving Algorithms for Coordinate-Wise Maxima and Convex Hulls

Kenneth L. Clarkson, Wolfgang Mulzer, C. Seshadhri

November 6, 2012 



Abstract 

Computing the coordinate-wise maxima and convex hull of a planar point set are probably
the most classic problems in computational geometry. We give an algorithm for these problems
in the self-improving setting. We have n (unknown) independent distributions $\mathcal{D}_1, \mathcal{D}_2, \ldots, \mathcal{D}_n$
of planar points. An input set of points $(p_1, p_2, \ldots, p_n)$ is generated by taking an independent
sample $p_i$ from each $\mathcal{D}_i$, so the input distribution $\mathcal{D}$ is the product $\prod_i \mathcal{D}_i$. A self-improving
algorithm repeatedly gets inputs from the distribution $\mathcal{D}$ (which is a priori unknown) and tries
to optimize its running time for $\mathcal{D}$. Our algorithm uses the first few inputs to learn salient
features of the distribution and then becomes an optimal algorithm for distribution $\mathcal{D}$. Let
$\text{OPT-MAX}_{\mathcal{D}}$ (resp. $\text{OPT-CH}_{\mathcal{D}}$) denote the expected depth of an optimal linear comparison
tree computing the maxima (resp. convex hull) for distribution $\mathcal{D}$. Our maxima algorithm
eventually has an expected running time of $O(\text{OPT-MAX}_{\mathcal{D}} + n)$, even though it did not know
$\mathcal{D}$ to begin with. Analogously, we have a self-improving algorithm for convex hull computation
with expected running time $O(\text{OPT-CH}_{\mathcal{D}} + n \log\log n)$.

Our results require new tools for manipulating linear comparison trees. In particular, we 
show how to convert a general linear comparison tree to a very restricted version, which can then 
be related to the running time of our algorithms. Another interesting feature of our approach is 
an interleaved search, where we try to determine the likeliest point to be extremal with minimal 
computation. This allows the running time to be competitive with the optimal algorithm for 
the distribution $\mathcal{D}$.



1 Introduction 

The problems of planar maxima and convex hull computation are classic computational geometry
questions and have been studied since at least 1975 [23]. They have well-known O(n log n)
time comparison-based algorithms (where n is the number of points), with matching lower bounds.
Further research has addressed a wide variety of more advanced settings: one can achieve linear
expected running time for uniformly distributed points in the unit square; output-sensitive algorithms
need O(n log h) time when the output has size h [20]; and there are results for external-memory
models [18].

A major problem with worst-case analysis is that it may not reflect the behavior of real-world 
inputs. Worst-case algorithms are tailor-made for extreme inputs, none of which may occur (rea- 
sonably often) in practice. Average-case analysis, on the other hand, tries to address this problem 
by assuming some fixed distribution on the inputs. For maxima, the property of coordinate-wise 
independence covers a broad range of inputs, and allows a clean analysis [7], but is unrealistic 
even so. Thus, the right distribution to analyze remains a point of investigation. Nonetheless, the 
assumption of randomly distributed inputs is very natural and one worthy of further study. 

The self-improving model. Ailon et al. introduced the self-improving model to address this
problem with average-case analysis [3]. In this model, there is some fixed but unknown input
distribution $\mathcal{D}$ that generates independent inputs, that is, whole input sets P. The algorithm
initially undergoes a learning phase, where it processes inputs with a worst-case guarantee but tries
to learn information about $\mathcal{D}$. The aim of the algorithm is to become optimal for the distribution $\mathcal{D}$.
After seeing some (hopefully small) number of inputs, the algorithm shifts into the limiting phase.
Now, the algorithm is tuned for $\mathcal{D}$, and the expected running time is (ideally) optimal for $\mathcal{D}$. A
self-improving algorithm can be thought of as an algorithm able to attain the optimal average-case
running time for all, or at least a large class of, distributions $\mathcal{D}$.

Following earlier self-improving algorithms, we assume that the input has a product distribution.
An input $P = (p_1, p_2, \ldots, p_n)$ is a set of n points in the plane. Each $p_i$ is generated independently
from a distribution $\mathcal{D}_i$, so the probability distribution of P is the product $\prod_i \mathcal{D}_i$. The $\mathcal{D}_i$ themselves
are arbitrary, and the only assumption made is their independence. There are lower bounds [2]
showing that some restriction on $\mathcal{D}$ is necessary for a reasonable self-improving algorithm, as we
shall explain later.

The first self-improving algorithm was for sorting, and it was later extended to Delaunay trian- 
gulations [2,11]. These results show that entropy-optimal performance is achievable in the limiting 
phase. Later, Bose et al. [5] described odds-on trees, which provide a general method to obtain 
self-improving versions for several query problems, such as planar point location, orthogonal range 
searching, or point-in-polytope queries. 

2 Results 

Our main results are self-improving algorithms for planar coordinate-wise maxima and convex 
hulls over product distributions. Before we can state our theorems formally, we need some basic 
definitions. First, we explain what it means for algorithms to be optimal for a distribution $\mathcal{D}$. This
in turn requires a notion of certificates, which allow the correctness of the output to be verified in
O(n) time. Any procedure for computing maxima or convex hulls must provide some "reason" to
deem an input point p non-maximal/non-extremal. Most current algorithms implicitly give exactly 
such certificates [17,21,23]. 

Definition 2.1. Let P be a planar point set. A maxima certificate $\gamma$ for P consists of: (i) the
sequence of the indices of the maximal points in P, sorted from left to right; (ii) for each
non-maximal point, a per-point certificate of non-maximality, which is simply the index of an input
point that dominates it. We say that a certificate $\gamma$ is valid for an input P if $\gamma$ satisfies these
conditions for P.
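
The two conditions of Definition 2.1 can be checked in a single pass over the input. The following is a minimal sketch, not the authors' implementation: the function names, the dictionary-based `witness` argument, and the assumption that all coordinates are distinct (general position) are illustrative choices.

```python
def dominates(q, p):
    """True if q dominates p (assumes distinct coordinates, i.e. general position)."""
    return q[0] > p[0] and q[1] > p[1]

def verify_maxima_certificate(points, maximal, witness):
    """Check a maxima certificate in O(n) time.

    points  : list of (x, y) tuples, the input P
    maximal : indices of the claimed maximal points, sorted left to right
    witness : dict mapping each other index to the index of a dominating point
    (all three names are hypothetical, chosen for this sketch)
    """
    n = len(points)
    listed = set(maximal)
    if len(listed) != len(maximal):
        return False
    # (i) The claimed maxima must form a staircase: x increases, y decreases.
    for a, b in zip(maximal, maximal[1:]):
        if not (points[a][0] < points[b][0] and points[a][1] > points[b][1]):
            return False
    # (ii) Every remaining point needs a valid per-point certificate.
    for i in range(n):
        if i in listed:
            continue
        j = witness.get(i)
        if j is None or not (0 <= j < n) or not dominates(points[j], points[i]):
            return False
    return True
```

For example, for the points (1,5), (2,3), (3,4), (4,1), (0,0), the maximal list [0, 2, 3] together with the witnesses {1: 2, 4: 0} passes the check.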

Figure 1: Per-point certificates for maxima and convex hulls. Left: both $q_1$ and $q_2$ can be possible
per-point certificates of non-maximality for p. Right: both $q_1, q_3$ and $q_2, q_4$ are possible witness
pairs for p.

We denote the upper convex hull of P by conv(P). A point $p \in P$ is called extremal if it appears
on conv(P), and non-extremal otherwise. For two points $p, q \in P$, we define the upper semislab
for p and q, uss(p, q), as the planar region that is bounded by the upward vertical rays through p
and q and the line segment pq. The lower semislab for p and q, lss(p, q), is defined similarly. We
call two points $q, r \in P$ a witness pair for a non-extremal point p if $p \in \mathrm{lss}(q, r)$.

Definition 2.2. Given a point set P, a convex hull certificate $\gamma$ for P has: (i) a list of the extremal
points in P, sorted from left to right; (ii) a list that contains a witness pair for each non-extremal
point in P. Each point in $\gamma$ is represented by its index in P.
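
A convex hull certificate can be checked in the same spirit. The sketch below is illustrative only: it assumes general position (no two points share an x-coordinate, no three are collinear), treats the semislab as open, and uses hypothetical names such as `verify_hull_certificate`; indices in the certificate are assumed to be valid positions in the input.

```python
def cross(o, a, b):
    """Twice the signed area of triangle (o, a, b); negative means a right turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def in_lower_semislab(p, q, r):
    """True if p lies strictly below segment qr and between the vertical lines through q and r."""
    if q[0] > r[0]:
        q, r = r, q
    return q[0] < p[0] < r[0] and cross(q, r, p) < 0

def verify_hull_certificate(points, extremal, witness):
    """Check an upper-hull certificate in O(n) time.

    extremal : indices of the claimed extremal points, sorted left to right
    witness  : dict mapping each other index to a pair (q_index, r_index)
    """
    listed = set(extremal)
    if len(listed) != len(extremal):
        return False
    hull = [points[i] for i in extremal]
    # (i) The claimed hull must go left to right and make only right turns.
    if any(a[0] >= b[0] for a, b in zip(hull, hull[1:])):
        return False
    if any(cross(a, b, c) >= 0 for a, b, c in zip(hull, hull[1:], hull[2:])):
        return False
    # (ii) Every remaining point must lie in the lower semislab of its witness pair.
    for i, p in enumerate(points):
        if i in listed:
            continue
        pair = witness.get(i)
        if pair is None:
            return False
        if not in_lower_semislab(p, points[pair[0]], points[pair[1]]):
            return False
    return True
```

The `cross` function is the usual CCW test, so the whole check uses only comparisons of the kind allowed in the computational model described next.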

The model of computation that we use to define optimality is a linear computation tree that 
generates query lines using the input points. In particular, our model includes the usual CCW-test 
that forms the basis for many geometric algorithms. 

Let $\ell$ be a directed line. We use $\ell^+$ to denote the open halfplane to the left of $\ell$ and $\ell^-$ to
denote the open halfplane to the right of $\ell$.

Definition 2.3. A linear comparison tree T is a rooted binary tree such that each node v of T is
labeled with a query of the form "$p \in \ell_v^+$". Here p denotes an input point and $\ell_v$ denotes a directed
line. The line $\ell_v$ can be obtained in four ways:

1. it can be a line independent of the input (but dependent on the node v);

2. it can be a line with a slope independent of the input (but dependent on v) passing through a
given input point;

3. it can be a line through an input point and through a point $q_v$ independent of the input (but
dependent on v);

4. it can be the line defined by two distinct input points.

A linear comparison tree is restricted if it has only nodes of type (1).

Let T be a linear comparison tree and v be a node of T. Note that v corresponds to a region
$\mathcal{R}_v \subseteq \mathbb{R}^{2n}$ such that an evaluation of T on input P reaches v if and only if $P \in \mathcal{R}_v$. If T is
restricted, then $\mathcal{R}_v$ is the Cartesian product of a sequence $(R_1, R_2, \ldots, R_n)$ of polygonal regions.
Given T, there exists exactly one leaf v(P) that is reached by the evaluation of T on input P. We
now define what it means for a linear comparison tree to compute the maxima or convex hull.

Figure 2: The four types of comparisons: we can compare (1) with a fixed line; (2) a line through
an input point p, with a fixed slope s; (3) a line through an input point p and a fixed point q; and
(4) a line through two input points $p_1$ and $p_2$.

Definition 2.4. Let T be a linear comparison tree. If each leaf v of T is labeled with a maxima
certificate that is valid for every input P with v = v(P), we say that T computes the maxima of
P. The notion that T computes the convex hull of P is defined analogously.

The depth of node v in T, denoted by $d_v$, is the length of the path from the root of T to v. The
expected depth of T over $\mathcal{D}$, $d_{\mathcal{D}}(T)$, is defined as $\mathbf{E}_{P \sim \mathcal{D}}[d_{v(P)}]$. Consider some comparison-based
algorithm A that is modeled by such a tree T. The expected depth of T is a lower bound on the
expected number of comparisons performed by A.

Let $\mathbf{T}$ be the set of trees that compute the maxima of n points. We define $\text{OPT-MAX}_{\mathcal{D}} =
\inf_{T \in \mathbf{T}} d_{\mathcal{D}}(T)$. This is a lower bound on the expected time taken by any linear comparison tree to
compute the maxima of inputs distributed according to $\mathcal{D}$. We would like our algorithm to have a
running time comparable to $\text{OPT-MAX}_{\mathcal{D}}$.

Theorem 2.5. Let $\varepsilon > 0$ be a fixed constant and $\mathcal{D}_1, \mathcal{D}_2, \ldots, \mathcal{D}_n$ be independent planar point
distributions. The input distribution is $\mathcal{D} = \prod_i \mathcal{D}_i$. There is a self-improving algorithm to compute
the coordinate-wise maxima whose expected time in the limiting phase is $O(\varepsilon^{-1}(n + \text{OPT-MAX}_{\mathcal{D}}))$.
The learning phase lasts for $O(n^{\varepsilon})$ inputs and the space requirement is $O(n^{1+\varepsilon})$.

Analogously, we have a self-improving result for convex hulls. Unfortunately, it is slightly
suboptimal. As before, we set $\text{OPT-CH}_{\mathcal{D}} = \inf_{T \in \mathbf{T}} d_{\mathcal{D}}(T)$, where now $\mathbf{T}$ is the set of trees computing
the convex hull of n points. The conference version [9] claimed an optimal result, but it was flawed.
Our new analysis is simpler and closer in style to the maxima result.

Figure 3: (i) A point set and distribution that is slow in other models: the upper hull U is fixed,
while the points $p_{n/2+1}, \ldots, p_n$ roughly constitute a random permutation of the points in L. (ii)
Point $p_1$ is either at $p_h$ or $p_\ell$, and its location affects the processing for the other points.

Theorem 2.6. Let $\varepsilon > 0$ be a fixed constant and $\mathcal{D}_1, \mathcal{D}_2, \ldots, \mathcal{D}_n$ be independent planar point
distributions. The input distribution is $\mathcal{D} = \prod_i \mathcal{D}_i$. There is a self-improving algorithm to compute
the convex hull whose expected time in the limiting phase is $O(n \log\log n + \varepsilon^{-1}(n + \text{OPT-CH}_{\mathcal{D}}))$.
The learning phase lasts for $O(n^{\varepsilon})$ inputs and the space requirement is $O(n^{1+\varepsilon})$.

Prior Algorithms. With a plethora of planar maxima and convex hull algorithms available, it 
might seem that there could already be one that is self-improving. A single example shows that for 
several prior algorithmic approaches, this is not so. We will focus on convex hulls throughout this 
discussion, though it is equally valid for maxima as well. Refer to the first example in Figure 3. 
The input points in the example are in two groups: a lower group L, that is not on the upper hull, 
and an upper group U, arranged so that all points in U are vertices of the upper hull. The sets 
L and U have the same number of points n/2. Suppose that the input distribution $\mathcal{D}$ takes the
following form: the points $p_1$ through $p_{n/2}$ take fixed positions to form U, and for $p_i$ with i > n/2,
a random point of L is chosen to be $p_i$ (some points of L may be chosen more than once). Thus the
"lower" points are essentially a randomly permuted subset of L, with the number of distinct points
$\Omega(n)$. The output convex hull will always have vertex set U, while the points of L are all shown
to be non-extremal because they are below the line segment with endpoints $p_1$ and $p_{n/2}$. Thus an
optimal algorithm can output the convex hull in O(n) time. 

However, in several other algorithmic models, this example needs $\Omega(n \log n)$ time. Since the
output size is n/2, an output-sensitive algorithm requires $\Omega(n \log n)$. This implies that the structural
entropy introduced by Barbay [4] is here $\Omega(n \log n)$. Since the expected number of upper hull
vertices of a random r-subset of $U \cup L$ is r/2, a randomized incremental algorithm takes $\Theta(n \log n)$
time to compute conv(P) [12]. Because computation of the hull takes linear time for points sorted by
a coordinate, say the x-coordinate, it might be thought that self-improving algorithms for sorting [2]
would be helpful; however, since the entropy of the sorting output is $\Omega(n \log n)$, such an algorithm
would not give a speedup here. By a similar argument, a self-improving algorithm for Delaunay
triangulations will also not help. For this input distribution, the entropy of the output Delaunay
triangulations is $\Omega(n \log n)$ (because of the randomness among the points of L).

Afshani et al. [1] gave instance-optimal algorithms that are "optimal" (in certain restricted
models) for instances of the planar convex hull problem. In their setup, all inputs from this
distribution would be considered "difficult" and would be dealt with in $\Omega(n \log n)$ time. The main
difference is that they are interested in an optimal algorithm for any given input considered as a
set, whereas we want an optimal algorithm for an ensemble of inputs from a special distribution,
where each input is considered as a sequence. Indeed, for our algorithm it is essential to know the
identity of each point (that is, to know that $p_i$ is the i-th point).

To conclude, we also mention the paradigm of preprocessing imprecise points for faster
(Delaunay) triangulation and convex hull computation [6,16,19,22,24]. Here, we are given a set $\mathcal{R}$
of planar regions, and we would like to preprocess $\mathcal{R}$ in order to quickly find the (Delaunay)
triangulation or convex hull for any point set which contains exactly one point from each region in
$\mathcal{R}$. This setting is adversarial, but if we only consider point sets where a point is randomly drawn
from each region, it can be regarded as a special case of our problem. In this view, these results
give us bounds for the eventual running time of a self-improving algorithm if $\mathcal{D}$ draws its points
from disjoint planar regions, even though the self-improving algorithm has no prior knowledge of
the special input structure. A particularly interesting scenario is the following: given a set of lines
L in the plane, we would like to preprocess L such that given any point set P with one point from
each line in L, we can compute the convex hull of P quickly. It is possible to achieve near-linear
expected time for the convex hull computation, albeit at the expense of a quadratic preprocessing
time and storage requirement [16]. Our self-improving algorithm now shows that if each point in
P is drawn randomly from a line in L, we can still achieve near-linear expected running time, but
with an improved storage requirement of $O(n^{1+\varepsilon})$.

Why is this hard? Since the convex hull is essentially part of the Delaunay triangulation, 
generally algorithms for the former are simpler than those for the latter; since a self-improving 
algorithm for Delaunay triangulation was already known, it seems natural to assume that a planar 
convex hull algorithm in the same model should follow easily, and be simpler than the Delaunay 
algorithm. 1 

As we explained above, this is not true. Let us try to give some more intuition for this. The
reason seems to be the split in output status among the points: some points are simply listed as
extremal, and others need a certificate of non-extremality; the certificate may or may not be easy
to find. In the first example of Figure 3, the certificates of non-extremality are all "easy". However,
if the points of L are placed just below the edges of the upper hull, then to find the certificate for
a given point $p_i$ with i > n/2, a search must be done to find the single hull edge above $p_i$, and the
certificates are "hard". A simple example shows that even though the points are independent, the
convex hull can exhibit very dependent behavior. In the second example of Figure 3, point $p_1$ can
be at either $p_h$ or $p_\ell$, but the other points are fixed. The other points become extremal depending
on the position of $p_1$. This makes life rather hard for entropy-optimality, since for $p_1 = p_\ell$ the
ordering of the remaining points must be determined, but otherwise that is not necessary.

In our algorithm, and plausibly for any algorithm, some form of point location must be done
for each $p_i$. (Here, one search we do involves dividing the plane by x-coordinate, and searching
among the resulting vertical slabs.) If point $p_i$ is "easily" shown to be non-extremal, then the
point location search should be correspondingly shallow. However, it does not seem to be possible
to determine, in advance, how deep the point location could go: imagine the points L of Figure 3
doubled up and placed at both the "hard" and "easy" positions, and $p_i$ for i > n/2 chosen randomly
from among those positions; the necessary search depth can only be determined using the position
of the point. Also, the certificates may be easy to find if we know which points are extremal, or
at least "near extremal," but determining that is why we are doing the search to begin with.

1 The authors certainly thought so for awhile, before painfully learning otherwise. 

3 Preliminaries and basic facts



First, we give some basic definitions and concepts that will be used throughout the paper. 

3.1 Basic notation 

We use c to denote a sufficiently large constant. Our input point set is called $P = (p_1, \ldots, p_n)$, and
it comes from a product distribution $\mathcal{D} = \prod_{i=1}^{n} \mathcal{D}_i$. For any point $p \in P$, we write x(p) and y(p)
for the x and y coordinates of p. As mentioned, given a directed line $\ell$, we write $\ell^+$ for the open
halfplane to the left of $\ell$, and $\ell^-$ for the open halfplane to the right of $\ell$. If $R \subseteq \mathbb{R}^2$ is a planar
region, then a halving line for R with respect to a distribution $\mathcal{D}_i$ is a line $\ell$ such that

$\Pr_{p \sim \mathcal{D}_i}[p \in \ell^+ \cap R] = \Pr_{p \sim \mathcal{D}_i}[p \in \ell^- \cap R]$.

Note that if $\Pr_{p \sim \mathcal{D}_i}[p \in R] = 0$, every line is a halving line for R.

We use the phrase "with high probability" for any probability larger than $1 - n^{-\Omega(1)}$. The
constant buried in the $\Omega(1)$ can be made arbitrarily large by increasing the constant c. We will take
union bounds over polynomially many (with a fixed exponent, usually at most $n^2$) low probability
events and still have a low probability bound.

3.2 Linear comparison trees 

In this section, we will discuss some basic properties of linear comparison trees. The crucial lemma 
shows that any linear comparison tree can be converted to a convenient form with only a constant 
blow-up in depth. 

Let T be a linear comparison tree. We remind the reader that for each node v of T there exists
a set $\mathcal{R}_v \subseteq \mathbb{R}^{2n}$ such that an evaluation of T on input P reaches v if and only if $P \in \mathcal{R}_v$. The next
proposition demonstrates the usefulness of restricted linear comparison trees. They ensure that the
sets $\mathcal{R}_v$ are Cartesian products of planar regions, which will later allow us to analyze each input
point independently.

Proposition 3.1. Let T be a restricted linear comparison tree, and let v be a node in T. Then
there exists a sequence $(R_1, R_2, \ldots, R_n)$ of (possibly unbounded) convex planar polygons such that
$\mathcal{R}_v = \prod_{i=1}^{n} R_i$. In other words, the evaluation of T on $P = (p_1, \ldots, p_n)$ reaches v if and only if
$p_i \in R_i$ for all i.

Proof. The proof is by induction on $d_v$. For the root, we set $R_1 = \cdots = R_n = \mathbb{R}^2$. If $d_v \ge 1$,
let v' be the parent of v. By induction, there are planar convex polygons $R'_i$ with $\mathcal{R}_{v'} = \prod_{i=1}^{n} R'_i$.
Furthermore, since T is restricted, v' is labeled with a test of the form "$p_j \in \ell_{v'}^+$", where the line
$\ell_{v'}$ is independent of the input. Thus, we can take $R_i = R'_i$ for $i \neq j$, and depending on whether v
is the left or the right child of v', we set $R_j = R'_j \cap \ell_{v'}^+$ or $R_j = R'_j \cap \ell_{v'}^-$. □

Next, we will show how to restrict the linear comparison trees even further, so that the depth 
of a node v is related to the probability that v is reached by a random input $P \sim \mathcal{D}$. This will
allow us to compare the running times of our algorithms with the depth of a near optimal tree. 

Definition 3.2. Let T be a restricted linear comparison tree. We call T entropy-sensitive if the
following holds for every node v of T: let $\mathcal{R}_v = \prod_{i=1}^{n} R_i$, and let v be labeled with a test of the form
"$p_j \in \ell_v^+$". Then $\ell_v$ is a halving line for $R_j$.

The following proposition shows that the depth of a node in an entropy-sensitive linear
comparison tree is related to the probability that it is visited.

Proposition 3.3. Let v be a node in an entropy-sensitive comparison tree, and let $\mathcal{R}_v = \prod_{i=1}^{n} R_i$.
Then

$d_v = -\sum_{i=1}^{n} \log \Pr[p_i \in R_i]$.

Proof. We use induction on $d_v$. The root has depth 0 and all probabilities are 1, so the claim holds
in this case. Now let $d_v \ge 1$ and v' be the parent of v. Write $\mathcal{R}_{v'} = \prod_{i=1}^{n} R'_i$ and $\mathcal{R}_v = \prod_{i=1}^{n} R_i$.
By induction, we have $d_{v'} = -\sum_{i=1}^{n} \log \Pr[p_i \in R'_i]$. Since T is entropy-sensitive, v' is labeled with
a test of the form "$p_j \in \ell_{v'}^+$", where $\ell_{v'}$ is a halving line for $R'_j$, i.e., $\Pr[p_j \in R'_j \cap \ell_{v'}^+] =
\Pr[p_j \in R'_j \cap \ell_{v'}^-] = \Pr[p_j \in R'_j]/2$. Since $R_i = R'_i$ for $i \neq j$ and $R_j = R'_j \cap \ell_{v'}^+$ or $R_j = R'_j \cap \ell_{v'}^-$, it follows
that

$-\sum_{i=1}^{n} \log \Pr[p_i \in R_i] = 1 - \sum_{i=1}^{n} \log \Pr[p_i \in R'_i] = 1 + d_{v'} = d_v$,

as desired. □
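
As a small sanity check of Proposition 3.3 (an illustrative instance, not taken from the paper), suppose n = 2, logarithms are base 2, and the path from the root to v tests $p_1$ twice and $p_2$ once, each time against a halving line of the current region. Each test halves the probability of the tested point's region, so:

```latex
\Pr[p_1 \in R_1] = \tfrac{1}{4}, \qquad \Pr[p_2 \in R_2] = \tfrac{1}{2},
\qquad
-\sum_{i=1}^{2} \log_2 \Pr[p_i \in R_i] = 2 + 1 = 3 = d_v .
```

The three tests on the path account exactly for the three bits of information gained about the input.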

In Section 4, we will show that for our purposes, it suffices to restrict our attention to entropy- 
sensitive comparison trees. The following lemma is a very important part of the proof, as it gives 
handles on OPT-MAX and OPT-CH. 

Lemma 3.4. Let T be a finite linear comparison tree and $\mathcal{D}$ be a product distribution over points.
Then there exists an entropy-sensitive comparison tree T' with expected depth $d_{\mathcal{D}}(T') = O(d_{\mathcal{D}}(T))$,
as $d_{\mathcal{D}}(T) \to \infty$.

3.3 Search trees and restricted searches 

We define the notion of restricted searches, central to our proof of optimality. Let U be an ordered
finite set and $\mathcal{F}$ be a distribution over U. For any element $j \in U$, we write $q_j$ for the probability
of j according to $\mathcal{F}$. For any sequence of numbers $a_j$, $j \in U$, and for any interval $S \subseteq U$, we use
$a_S$ to denote $\sum_{j \in S} a_j$. Thus, if S is an interval of U, then $q_S$ is the total probability of S.

Let T be a search tree over U. It will be convenient to think of T as (at most) ternary, where
each node has at most 2 children that are internal nodes. We associate each internal node v of T
with an interval $S_v \subseteq U$ in the obvious way (any element in $S_v$ encounters v along its search path).
In our application of the lemma, U will just be the set of leaf slabs of a slab structure $\mathbf{S}$, see below.
We now introduce some definitions regarding restricted searches and search trees.

Definition 3.5. Consider an interval S of U. An S-restricted distribution $\mathcal{F}_S$ is given by the
probabilities $\Pr_{\mathcal{F}_S}[j] := s_j / \sum_{r \in U} s_r$ for all $j \in U$, where the $s_j$, $j \in U$, have the property that
$0 \le s_j \le q_j$ if $j \in S$, and $s_j = 0$ otherwise.

Let $j \in S$. An S-restricted search for j is a search for j in T that terminates as soon as it
reaches the first node v with $S_v \subseteq S$.
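
To make the notion of an S-restricted search concrete, here is a minimal sketch. It assumes a plain balanced binary tree over the universe {0, ..., m-1} rather than the ternary, probability-balanced trees used in the paper; the class `Node` and the function `restricted_search` are hypothetical names.

```python
class Node:
    """Search-tree node whose interval S_v is {lo, ..., hi - 1}."""

    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
        self.left = self.right = None
        if hi - lo > 1:
            mid = (lo + hi) // 2
            self.left, self.right = Node(lo, mid), Node(mid, hi)

def restricted_search(root, j, s_lo, s_hi):
    """S-restricted search for j, where S = {s_lo, ..., s_hi - 1} contains j.

    Returns the first node v on the search path with S_v contained in S,
    together with the number of steps taken to reach it.
    """
    assert s_lo <= j < s_hi
    v, steps = root, 0
    # The leaf for j has S_v = {j}, which lies in S, so the loop stops at a leaf at the latest.
    while not (s_lo <= v.lo and v.hi <= s_hi):
        v = v.left if j < v.left.hi else v.right
        steps += 1
    return v, steps
```

With a universe of 8 elements, searching for j = 5 restricted to S = {4, ..., 7} stops after one step at the node for {4, ..., 7}, whereas restricting to S = {5} forces the full descent to the leaf.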

Definition 3.6. Let $\mu \in (0,1)$ be a parameter. A search tree T over U is $\mu$-reducing if for any
internal node v and for any non-leaf child w of v, we have $q_{S_w} \le \mu q_{S_v}$.

A search tree T is $\alpha$-optimal for restricted searches over $\mathcal{F}$ if for every interval $S \subseteq U$ and
every S-restricted distribution $\mathcal{F}_S$, the expected time of an S-restricted search over $\mathcal{F}_S$ is at most
$\alpha(1 - \log s_S)$. (The values $s_j$ are as in Definition 3.5.)

Figure 4: Left: A universe with 6 elements and a search tree on it. The nodes in the tree are
ternary, with at most two internal children. The node v corresponds to the interval $S_v = \{1, 2, 3\}$.
Right: A vertical slab structure with 6 leaf slabs. (Note that the left and right unbounded slabs
also count as leaf slabs.) S is a (general) slab that consists of 3 leaf slabs, i.e., |S| = 3.

We give the main lemma about restricted searches. A tree that is optimal for searches over $\mathcal{F}$
also works for restricted distributions. The proof is given in Section 7.

Lemma 3.7. Suppose T is a $\mu$-reducing search tree for $\mathcal{F}$. Then T is $O(1/\log(1/\mu))$-optimal for
restricted searches over $\mathcal{F}$.

3.4 Data structures 

We describe data structures that will be used in both of our algorithms. A vertical slab structure
$\mathbf{S}$ is a sequence of vertical lines that partition the plane into vertical regions, called leaf slabs. (We
will consider the latter to be the open regions between the vertical lines. Since we assume that our
distributions are continuous, we abuse notation and consider the leaf slabs to partition the plane.)
More generally, a slab is the region between any two vertical lines of $\mathbf{S}$. The size of a slab S, |S|, is
the number of leaf slabs it contains. The size of the slab structure is the total number of leaf slabs.
We denote it by $|\mathbf{S}|$. Furthermore, for any slab S, the probability that $p_i \sim \mathcal{D}_i$ is in S is denoted
by q(i, S).
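
A vertical slab structure is easy to represent as a sorted list of x-coordinates. The sketch below is illustrative only: the class name, the 0-based numbering of leaf slabs, and the sample-based estimate of q(i, S) are assumptions made for this example, not the construction used in the learning phase.

```python
import bisect

class SlabStructure:
    """Vertical slab structure given by the x-coordinates of its vertical lines."""

    def __init__(self, line_xs):
        self.xs = sorted(line_xs)

    def __len__(self):
        # k vertical lines create k + 1 leaf slabs, including the two unbounded ones.
        return len(self.xs) + 1

    def leaf_slab(self, p):
        """Index (0-based, left to right) of the leaf slab containing point p."""
        return bisect.bisect_left(self.xs, p[0])

    def estimate_q(self, samples, first, last):
        """Empirical probability that a sample lands in the slab made of leaf slabs first..last."""
        hits = sum(1 for p in samples if first <= self.leaf_slab(p) <= last)
        return hits / len(samples)
```

For instance, five vertical lines give the six leaf slabs of Figure 4, and a general slab S corresponds to a contiguous range of leaf-slab indices.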

Both our algorithms will construct special slab structures in the learning phase. Similar struc- 
tures were first constructed in [2], and we follow the same strategy. All missing proofs are given in 
Section 8. 

Lemma 3.8. We can construct a slab structure $\mathbf{S}$ with O(n) leaf slabs such that, with probability
$1 - n^{-3}$ over the construction of $\mathbf{S}$, the following holds. For a leaf slab $\lambda$ of $\mathbf{S}$, let $X_\lambda$ denote
the number of points in a random input P that fall into $\lambda$. For every leaf slab $\lambda$ of $\mathbf{S}$, we have
$\mathbf{E}[X_\lambda^2] = O(1)$. The construction takes O(log n) rounds and $O(n \log^2 n)$ time.

The algorithms construct a collection of specialized search trees for $\mathbf{S}$ for each distribution $\mathcal{D}_i$.
It is important that these trees can be represented in small space. The following lemma gives the
details of the constructed search trees.

Lemma 3.9. Let $\varepsilon > 0$ be a fixed parameter and $\mathbf{S}$ a slab structure with O(n) leaf slabs. In
$O(n^{\varepsilon})$ rounds and $O(n^{1+\varepsilon})$ time, we can construct search trees $T_1, T_2, \ldots, T_n$ over $\mathbf{S}$ such that the
following holds: (i) the trees can be represented in $O(n^{1+\varepsilon})$ total space; (ii) with probability $1 - n^{-3}$
over the construction of the $T_i$, every $T_i$ is $O(1/\varepsilon)$-optimal for restricted searches over $\mathcal{D}_i$.

We will also require a simple structure that maintains (key, index) pairs. The indices are all
distinct and always in [n]. The keys are elements from the ordered universe $[|\mathbf{S}|]$. We store these
pairs in a data structure that allows the operations insert, delete (deleting a pair), find-max
(finding the maximum key among the stored pairs), and decrease-key (decreasing the key of some
stored pair). For delete and decrease-key, we assume the input is a pointer into the data
structure to the stored pair to operate on.

Claim 3.10. Suppose there are x find-max operations and y decrease-key operations. We can
implement the data structure L(A) such that the total time for the operations is O(n + x + y). The
storage requirement is O(n).

Proof. We represent L(A) as an array of lists. For every $k \in [|\mathbf{S}|]$, we keep a list of indices whose key
values are k. We maintain m, the current maximum key. The total storage is O(n). A find-max
trivially takes O(1) time, and an insert is done by adding the element to the appropriate list. For
a delete, we remove the element from the list (assuming appropriate pointers are available). We
now have to update the maximum. If the list at m is non-empty, no action is required. If it is
empty, we check sequentially whether the list at m - 1, m - 2, ... is empty. This will eventually lead
to the maximum. To do a decrease-key, we delete, insert, and then update the maximum.

Note that since all key updates are decrease-keys, the maximum can only decrease. Hence,
the total overhead for scanning for a new maximum is O(n). □
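
The array-of-lists structure from the proof can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: Python sets and a dictionary stand in for the linked lists and pointers, keys are numbered 0, ..., num_keys - 1, and the amortized bound is only meaningful when all inserts happen before the first delete or decrease-key, so that the maximum never increases afterwards.

```python
class BucketMaxStructure:
    """(key, index) pairs bucketed by key, with a downward-moving maximum pointer."""

    def __init__(self, num_keys):
        self.buckets = [set() for _ in range(num_keys)]  # buckets[k] holds indices with key k
        self.key_of = {}                                 # index -> current key
        self.max_key = -1                                # -1 means the structure is empty

    def insert(self, key, index):
        self.buckets[key].add(index)
        self.key_of[index] = key
        self.max_key = max(self.max_key, key)

    def _fix_max(self):
        # Scan downward past empty buckets; total scanning work is bounded by the
        # number of buckets as long as the maximum never increases again.
        while self.max_key >= 0 and not self.buckets[self.max_key]:
            self.max_key -= 1

    def delete(self, index):
        key = self.key_of.pop(index)
        self.buckets[key].discard(index)
        self._fix_max()

    def find_max(self):
        """Return some (key, index) pair with maximum key, or None if empty."""
        if self.max_key < 0:
            return None
        return self.max_key, next(iter(self.buckets[self.max_key]))

    def decrease_key(self, index, new_key):
        old_key = self.key_of[index]
        assert new_key <= old_key, "keys may only decrease"
        self.buckets[old_key].discard(index)
        self.buckets[new_key].add(index)
        self.key_of[index] = new_key
        self._fix_max()
```

Each operation other than the downward scan takes constant time, which matches the accounting in the proof of Claim 3.10.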

4 Linear comparison trees 

A major challenge of self-improving algorithms is the strong requirement of optimality for the 
distribution D. We focus on the model of linear comparison trees, and let T be an optimal tree 
for distribution D. (There may be distributions where such an exact T does not exist, but we 
can always find one that is nearly optimal.) One of our key insights is that when D is a product 
distribution, we can convert T to a restricted comparison tree whose expected depth is only a 
constant factor worse. In other words, there always exists a near-optimal restricted comparison 
tree for our problem. Furthermore, we will see that this tree can be made entropy-sensitive. 

In such a tree, each leaf v is labeled with a sequence of regions $\mathcal{R}_v = (R_1, R_2, \ldots, R_n)$. Any
input $P = (p_1, p_2, \ldots, p_n)$ such that $p_i \in R_i$ for all i will lead to v. Since the distributions are
independent, we can argue that the probability that an input leads to v is $\prod_i \Pr_{p_i \sim \mathcal{D}_i}[p_i \in R_i]$.
Furthermore, by entropy-sensitivity, the depth of v is $-\sum_i \log \Pr[p_i \in R_i]$. This gives us a concrete
bound that we can exploit.

It now remains to show that if we start with a random input from $\mathcal{R}_v$, the expected running
time is bounded by the sum given above. We will argue that for such an input, as soon as the
search for $p_i$ locates it inside $R_i$, the search will terminate. This leads to the optimal running time.

The purpose of this section is to prove Lemma 3.4. This is done in two steps. First, we go from 
a linear comparison tree to a restricted linear comparison tree. 

Lemma 4.1. Let T be a finite linear comparison tree and $\mathcal{D}$ be a product distribution over points.
Then there exists a restricted comparison tree T' with expected depth $d_{\mathcal{D}}(T') = O(d_{\mathcal{D}}(T))$, as
$d_{\mathcal{D}}(T) \to \infty$.

Figure 5: The different cases in the proof of Claim 4.3.

The proof of Lemma 4.1 is in Section 4.1. Next, we show how to go from a restricted linear 
comparison tree to an entropy-sensitive comparison tree. 

Lemma 4.2. Let T be a restricted linear comparison tree. Then there exists an entropy-sensitive
comparison tree T' with expected depth $d_{T'} = O(d_T)$.

The proof of Lemma 4.2 is in Section 4.2. Given Lemmas 4.1 and 4.2, the proof of Lemma 3.4 is
immediate by applying the lemmas in succession.

4.1 Reducing to restricted comparison trees 

We will describe a transformation from T into a restricted comparison tree with similar depth. The 
first step is to show how to represent a single comparison by a restricted linear comparison tree, 
provided that P is drawn from a product distribution. The final transformation basically replaces 
each node of T by the subtree given by the next claim. For convenience, we will drop the subscript
$\mathcal{D}$ from $d_{\mathcal{D}}$, since we focus on a fixed distribution.

Claim 4.3. Consider a comparison C as described in Definition 2.3, where the comparisons are
listed in increasing order of complexity. Let $\mathcal{D}'$ be a product distribution for P such that each $p_i$
is drawn from a polygonal region $R_i$. Then either C is the simplest, type (1) comparison, or there
exists a restricted linear comparison tree $T_C$ that resolves the comparison C such that the expected
depth of $T_C$ (over the distribution $\mathcal{D}'$) is O(1), and all comparisons used in $T_C$ are less complex
than C.

Proof. $v$ is of type (2). This means that $v$ needs to determine whether an input point $p_i$ lies to
the left of the directed line $\ell$ through another input point $p_j$ with a fixed slope $a$. We replace this
comparison with a binary search. Let $R_j$ be the region in $\mathcal{D}'$ corresponding to $p_j$. Take a halving
line $\ell_1$ for $R_j$ with slope $a$. Then perform two comparisons to determine on which side of $\ell_1$ the
inputs $p_i$ and $p_j$ lie. If $p_i$ and $p_j$ lie on different sides of $\ell_1$, we declare success and resolve the
original comparison accordingly. Otherwise, we replace $R_j$ with the appropriate new region and
repeat the process until we can declare success. Note that in each attempt the success probability
is at least 1/4. The resulting restricted tree $T_C$ can be infinite. Nonetheless, the probability that
an evaluation of $T_C$ leads to a node of depth $k$ is at most $2^{-\Omega(k)}$, so the expected depth is $O(1)$.

$v$ is of type (3). Here the node $v$ needs to determine whether an input point $p_i$ lies to the left of
the directed line $\ell$ through another input point $p_j$ and a fixed point $q$.

We partition the plane by a constant-sized family of cones, each with apex $q$, such that for each
cone $V$ in the family, the probability that the line $\overline{q p_j}$ meets $V$ (other than at $q$) is at most 1/2. Such
a family could be constructed by sweeping a line around $q$, or by taking a sufficiently large, but
constant-sized, sample from the distribution of $p_j$, and bounding the cones by all lines through
$q$ and each point of the sample. Such a construction has a non-zero probability of success, and
therefore the described family of cones exists.

We build a restricted tree that locates a point in the corresponding cone. For each cone $V$, we
can recursively build such a family of cones (inside $V$), and build a tree for this structure as well.
Repeating for each cone, this leads to an infinite restricted tree $T_C$. We search for both $p_i$ and $p_j$ in
$T_C$. When we locate $p_i$ and $p_j$ in two different cones of the same family, the comparison between
$p_i$ and $\overline{q p_j}$ is resolved and the search terminates. The probability that they lie in the same cone
of a given family is at most 1/2, so the probability that the evaluation takes $k$ steps is at most
$2^{-\Omega(k)}$, and the expected depth is $O(1)$.

$v$ is of type (4). Here the node $v$ needs to determine whether an input point $p_i$ lies to the left of
the directed line $\ell$ through input points $p_j$ and $p_k$.

We partition the plane by a constant-sized family of triangles and cones, such that for each
region $V$ in the family, the probability that the line through $p_j$ and $p_k$ meets $V$ is at most 1/2.
Such a family could be constructed by taking a sufficiently large random sample of pairs $p_j$ and
$p_k$ and triangulating the arrangement of the lines through each pair. Such a construction has a
non-zero probability of success, and therefore such a family exists. (Other than the source of the
random lines used in the construction, this scheme goes back at least to [10]; a tighter version,
called a cutting, could also be used [8].)

When computing $C$, suppose $p_i$ is in region $V$ of the family. If the line $\overline{p_j p_k}$ does not meet
$V$, then the comparison outcome is known immediately. This occurs with probability at least 1/2.
Moreover, determining the region containing $p_i$ can be done with a constant number of comparisons
of type (1), and determining if $\overline{p_j p_k}$ meets $V$ can be done with a constant number of comparisons of
type (3); for the latter, suppose $V$ is a triangle. If $p_j \in V$, then $\overline{p_j p_k}$ meets $V$. Otherwise, suppose
$p_k$ is above all the lines through $p_j$ and each vertex of $V$; then $\overline{p_j p_k}$ does not meet $V$. Also, if $p_k$ is
below all the lines through $p_j$ and each vertex, then $\overline{p_j p_k}$ does not meet $V$. Otherwise, $\overline{p_j p_k}$ meets
$V$. So a constant number of type (1) and type (3) queries suffice.

By recursively building a tree for each region V of the family, comparisons of type (4) can be 
done via a tree whose nodes use comparisons of type (1) and (3) only. Since the probability of 
resolving the comparison is at least 1/2 with each family of regions that is visited, the expected 
number of nodes visited is constant. □ 
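
As a sanity check on the constant expected depth in the type (2) case, the geometric-series bound can be written out as follows. It uses only the per-attempt success probability of at least 1/4 stated in the proof; the symbol $N$ for the number of attempts is introduced here purely for illustration.

```latex
\[
  \Pr[N > k] \le \left(\tfrac{3}{4}\right)^{k}
  \quad\Longrightarrow\quad
  \mathbb{E}[N] \;=\; \sum_{k \ge 0} \Pr[N > k]
  \;\le\; \sum_{k \ge 0} \left(\tfrac{3}{4}\right)^{k} \;=\; 4 .
\]
```

Each attempt costs two comparisons, so under this bound the expected depth is at most 8. The same calculation with success probability at least 1/2 gives at most 2 expected rounds for the type (3) and type (4) cases.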

Given Claim 4.3, it is now easy to prove Lemma 4.1. 

Proof of Lemma 4.1. We transform $T$ into a tree $T'$ that has no comparisons of type (4), by using
the construction of Claim 4.3 where nodes of type (4) are replaced by a tree. We then transform $T'$
into a tree $T''$ that has no comparisons of type (3) or (4), and finally transform $T''$ into a restricted
tree. Each such transformation is done in the same general way, using one case of Claim 4.3, so we
focus on the first one.

We incrementally transform $T$ into the tree $T'$. In each step, we have a partial restricted
comparison tree $T''$ that will eventually become $T'$. Furthermore, during the process each node of
$T$ is in one of three different states. It is either finished, fringe, or untouched. Finally, we have a
function $S$ that assigns to each finished and to each fringe node $v$ of $T$ a subset $S(v)$ of nodes in $T''$.

The initial situation is as follows: all nodes of T are untouched except for the root which is 
fringe. Furthermore, the partial tree T" consists of a single root node r and the function S assigns 
the root of T to the set {r}. 

Now our transformation proceeds as follows. We pick a fringe node $v$ in $T$, and mark $v$ as
finished. For each child $v'$ of $v$, if $v'$ is an internal node of $T$, we mark it as fringe. Otherwise, we
mark $v'$ as finished. Next, we apply Claim 4.3 to each node $w \in S(v)$. Note that this is a valid
application of the claim, since $w$ is a node of $T''$, a restricted tree. Hence $\mathcal{R}_w$ is a product set, and
the distribution $\mathcal{D}$ restricted to $\mathcal{R}_w$ is a product distribution. Hence, we replace each node $w \in S(v)$
in $T''$ by the subtree given by Claim 4.3. Now $S(v)$ contains the roots of these subtrees. Each leaf
of each such subtree corresponds to an outcome of the comparison in $v$. (Potentially, the subtrees
are countably infinite, but the expected number of steps to reach a leaf is constant.) For each child
$v'$ of $v$, we define $S(v')$ as the set of all such leaves that correspond to the same outcome of the
comparison as $v'$. We continue this process until there are no fringe nodes left. By construction,
the resulting tree $T'$ is restricted.

It remains to argue that $d_{T'} = O(d_T)$.

Let $v$ be a node of $T$. We define two random variables $X_v$ and $Y_v$. The variable $X_v$ is the
indicator random variable for the event that the node $v$ is traversed for a random input $P \sim \mathcal{D}$.
The variable $Y_v$ denotes the number of nodes traversed in $T'$ that correspond to $v$ (i.e., the number
of nodes needed to simulate the comparison at $v$, if it occurs). We have $d_T = \sum_{v \in T} E[X_v]$, because
if the leaf corresponding to an input $P \sim \mathcal{D}$ has depth $d$, exactly $d$ nodes are traversed to reach
it. We also have $d_{T'} = \sum_{v \in T} E[Y_v]$, since each node in $T'$ corresponds to exactly one node $v$ in $T$.
Claim 4.4 below shows that $E[Y_v] = O(E[X_v])$, which completes the proof. □

Claim 4.4. $E[Y_v] \le c\,E[X_v]$.

Proof. Note that $E[X_v] = \Pr[X_v = 1] = \Pr[P \in \mathcal{R}_v]$. Since the sets $\mathcal{R}_w$, $w \in S(v)$, partition $\mathcal{R}_v$,
we can write $E[Y_v]$ as

$$E[Y_v \mid X_v = 0] \Pr[X_v = 0] + \sum_{w \in S(v)} E[Y_v \mid P \in \mathcal{R}_w] \Pr[P \in \mathcal{R}_w].$$

Since $Y_v = 0$ if $P \notin \mathcal{R}_v$, we have $E[Y_v \mid X_v = 0] = 0$. Also, $\Pr[P \in \mathcal{R}_v] = \sum_{w \in S(v)} \Pr[P \in \mathcal{R}_w]$.
Furthermore, by Claim 4.3, we have $E[Y_v \mid P \in \mathcal{R}_w] \le c$. The claim follows. □

4.2 Entropy-sensitive comparison trees 

We now prove Lemma 4.2. The proof extends the proof of Lemma 4.1, via an extension to Claim 4.3.
We can regard a comparison against a fixed halving line as simpler than a comparison against
an arbitrary fixed line. Our extension of Claim 4.3 is the claim that any type (1) node can be
replaced by a tree with constant expected depth, as follows. A comparison $p_i \in \ell^+$ can be replaced
by a sequence of comparisons to halving lines. Similar to the reduction for type (2) comparisons
in Claim 4.3, this is done by binary search. That is, let $\ell_1$ be a halving line for $R_i$ parallel to $\ell$.
We compare $p_i$ with $\ell_1$. If this resolves the original comparison, we declare success. Otherwise, we
repeat the process with the halving line for the new region $R_i'$. In each step, the probability of
success is at least 1/2. The resulting comparison tree has constant expected depth; we now apply
the construction of Lemma 4.1 to argue that for a restricted tree $T$ there is an entropy-sensitive
version $T'$ whose expected depth is larger by at most a constant factor.
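
The halving-line reduction above can be illustrated in one dimension. The following is a minimal sketch, not the paper's construction: it assumes the point is a single coordinate drawn uniformly from a finite sorted list `support` (so the median of the current candidate range plays the role of a halving line), and the names `resolve_by_halving`, `support`, and `t` are hypothetical.

```python
from typing import List, Tuple


def resolve_by_halving(p: float, t: float, support: List[float]) -> Tuple[bool, int]:
    """Decide whether p <= t using only comparisons of p against medians.

    Assumptions (illustrative only): `support` is sorted, has distinct values,
    and contains p. The candidate range support[lo:hi] models the current
    region of p; its median models a halving line parallel to the line x = t.
    Returns (answer, number_of_median_comparisons).
    """
    lo, hi = 0, len(support)
    steps = 0
    while True:
        # The whole region lies on one side of t: the comparison is resolved.
        if support[hi - 1] <= t:
            return True, steps
        if support[lo] > t:
            return False, steps
        # Otherwise the region straddles t and has at least two elements,
        # so both halves below are non-empty.
        mid = (lo + hi) // 2
        steps += 1
        if p <= support[mid - 1]:
            hi = mid
        else:
            lo = mid
```

For example, with `support = list(range(16))` and `t = 10.5`, the value `p = 3` is resolved after one median comparison, while values close to `t` need more. Under the uniform assumption each comparison resolves the query with probability at least 1/2, which matches the constant expected depth claimed in the text.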

5 A self-improving algorithm for coordinate-wise maxima 

We begin with an informal overview of the algorithm. 

If the points of $P$ are sorted by x-coordinate, the maxima of $P$ can be found easily by a
right-to-left sweep over $P$: we maintain the largest y-coordinate $Y$ of the points traversed so far; when
a point $p$ is visited in the traversal, if $y(p) < Y$, then $p$ is non-maximal, and the point $p_j$ with
$Y = y(p_j)$ gives a per-point certificate for $p$'s non-maximality. If $y(p) > Y$, then $p$ is maximal, and
we can update $Y$ and put $p$ at the beginning of the certificate list of maxima of $P$.
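
The right-to-left sweep just described can be sketched as follows. This is a minimal illustration that assumes distinct coordinates; the function name `maxima_by_sweep` and the representation of certificates as an index array are choices made here, not taken from the paper.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def maxima_by_sweep(points: List[Point]) -> Tuple[List[int], List[Optional[int]]]:
    """Right-to-left sweep over the points sorted by x-coordinate.

    Returns (maxima, certificate): `maxima` lists the indices of the maximal
    points from left to right, and certificate[i] is the index of a point
    dominating points[i], or None if points[i] is maximal.
    Assumes all x- and y-coordinates are distinct.
    """
    order = sorted(range(len(points)), key=lambda i: points[i][0])
    maxima: List[int] = []
    certificate: List[Optional[int]] = [None] * len(points)
    top: Optional[int] = None  # index of the point with the largest y seen so far
    for i in reversed(order):
        if top is None or points[i][1] > points[top][1]:
            maxima.append(i)  # new maximal point; it becomes the current top
            top = i
        else:
            certificate[i] = top  # top lies to the right of and above points[i]
    maxima.reverse()
    return maxima, certificate
```

The sort dominates the cost; the sweep itself is linear, which is why the text goes on to replace the sort by a self-improving sorter.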

This suggests the following approach to a self-improving algorithm for maxima: sort P with a 
self-improving sorter and then use the traversal. The self-improving sorter of [2] works by locating
each point of $P$ within the slab structure S of Lemma 3.8 using the trees $T_i$ of Lemma 3.9.

While this approach does use S and the $T_i$'s, it is not optimal for maxima, because the time
spent finding the exact sorted order of non-maximal points may be wasted: in some sense, we are 
learning much more information about the input P than necessary. To deduce the list of maxima, 
we do not need the sorted order of all points of P: it suffices to know the sorted order of just 
the maxima! An optimal algorithm would probably locate the maximal points in S and would 
not bother locating "extremely non-maximal" points. This is, in some sense, the difficulty that 
output-sensitive algorithms face. 

As a thought experiment, let us suppose that the maximal points of P are known to us, but not 
in sorted order. We search only for these in S and determine the sorted list of maximal points. We 
can argue that the optimal algorithm must also (in essence) perform such a search. We also need to 
find per-point certificates for the non-maximal points. We use the slab structure S and the search 
trees, but now we shall be very conservative in our searches. Consider the search for a point $p_i$. At
any intermediate stage of the search, $p_i$ is placed in a slab $S$. This rough knowledge of $p_i$'s location
may already suffice to certify its non-maximality: let $m$ denote the leftmost maximal point to the
right of $S$ (since the sorted list of maxima is known, this information can be easily deduced). We
check if $m$ dominates $p_i$. If so, we have a per-point certificate for $p_i$ and we promptly terminate
the search for $p_i$. Otherwise, we continue the search by a single step and repeat. We expect that
many searches will not proceed too long, achieving a better position to compete with the optimal 
algorithm. 

Non-maximal points that are dominated by many maximal points will usually have a very short 
search. Points that are "nearly" maximal will require a much longer search. So this approach 
should derive just the "right" amount of information to determine the maxima output. But wait! 
Didn't we assume that the maximal points were known? Wasn't this crucial in cutting down the 
search time? This is too much of an assumption, and because the maxima are highly dependent on 
each other, it is not clear how to determine which points are maximal before performing searches. 

The final algorithm overcomes this difficulty by interleaving the searches for sorting the points
with confirmation of the maximality of some points, in a rough right-to-left order that is a more
elaborate version of the traversal scheme given above for sorted points. The searches for all points
$p_i$ (in their respective trees $T_i$) are performed "together", and their order is carefully chosen. At
any intermediate stage, each point $p_i$ is located in some slab $S_i$, represented by some node of its
search tree. We choose a specific point and advance its search by one step. This order is very
important, and is the basis of our optimality. The algorithm is described in detail and analyzed in
Section 5.2.

Figure 6: Every point in $R_j$ dominates every point in $R_i$.

5.1 Optimality 

We use a more restricted form of certificates and search trees that allows for easier proofs of 
optimality. 

Proposition 5.1. Consider a leaf $v$ of a restricted linear comparison tree $T$ computing the maxima.
Let $R_i$ be the region associated with a non-maximal point $p_i \in P$ in $\mathcal{R}_v$. There exists some region $R_j$
associated with an extremal point $p_j$ such that every point in $R_j$ dominates every point in $R_i$.

Proof. The leaf $v$ is associated with a certificate $\gamma$, such that $\gamma$ is valid for every input that reaches
$v$. Consider the point $p_i \in P$. The certificate $\gamma$ associates the non-maximal point $p_i$ with $p_j$ such
that $p_j$ dominates $p_i$. For any input $P$ reaching $v$, $p_j$ dominates $p_i$. Let us first argue that $p_j$ can
be assumed to be maximal. Construct a digraph $G$ over the vertex set $[n]$ where the directed edge
$(u, w)$ is present if (according to $\gamma$) $p_u$ is dominated by $p_w$. All vertices have an outdegree of at most 1,
and there are no cycles in $G$ (since domination is a transitive relationship). Hence, $G$ is a forest of
trees with edges directed towards the root. The roots are all maximal vertices, and any point in a
subtree is dominated by the point corresponding to the root. Therefore, we can rewrite $\gamma$ so that
all dominating points are extremal.

Since $T$ is restricted, the region $\mathcal{R}_v \subseteq \mathbb{R}^{2n}$ corresponding to $v$ is a Cartesian product of polygonal
regions $(R_1, R_2, \ldots, R_n)$. Suppose there exist two subregions $R_i' \subseteq R_i$ and $R_j' \subseteq R_j$ such that $R_j'$
does not dominate $R_i'$. Consider the input $P$ where $p_i \in R_i'$ and $p_j \in R_j'$. The remaining points are
arbitrarily chosen in their respective $R_k$'s. The certificate $\gamma$ is not valid for $P$, contradicting the nature
of $T$. Hence, all points in $R_j$ dominate all points in $R_i$, and $p_j$ is extremal. □
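
The rewriting step in the first half of the proof (following each tree of the digraph up to its root) can be sketched as below. This is an illustrative sketch: the array encoding `dominator[i]` and the name `rewrite_to_maximal` are assumptions made here, with `None` marking a maximal point.

```python
from typing import List, Optional


def rewrite_to_maximal(dominator: List[Optional[int]]) -> List[Optional[int]]:
    """Rewrite a per-point certificate so that every dominator is maximal.

    dominator[i] is the index of some point dominating point i, or None if
    point i is maximal. The pointers form a forest directed towards the roots,
    and by transitivity the root of each tree dominates every point in it.
    """
    n = len(dominator)
    root: List[Optional[int]] = [None] * n  # memoised root of each vertex
    for i in range(n):
        path = []
        j = i
        # Walk up until we hit a maximal point or an already resolved vertex.
        while root[j] is None and dominator[j] is not None:
            path.append(j)
            j = dominator[j]
        r = root[j] if root[j] is not None else j
        root[j] = r
        for u in path:
            root[u] = r
    # Maximal points keep None; every other point now points at its root.
    return [None if dominator[i] is None else root[i] for i in range(n)]
```

Because resolved vertices are memoised, each pointer is followed a constant number of times, so the rewrite takes linear time in this encoding.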

We now enhance the notion of a certificate (Definition 2.1) to make it more useful for our 
algorithm's analysis. For technical reasons, we want points to be "well-separated" according to the 
slab structure S. By Proposition 5.1, every non-maximal point is associated with a dominating 
region. 

Definition 5.2. Let S be a slab structure. A certificate for an input P is called S-labeled if the 
following holds. Every maximal point is labeled with the leaf slab of S containing it. Every non- 
maximal point is either placed in the containing leaf slab, or is separated from a dominating region 
by a slab boundary. 

We say that a tree $T$ computes the S-labeled maxima if the leaves are labeled with S-labeled
certificates.

Lemma 5.3. There exists an entropy-sensitive comparison tree $T$ computing the S-labeled maxima
whose expected depth over $\mathcal{D}$ is $O(n + \mathrm{OPT\text{-}MAX}_{\mathcal{D}})$.

Proof. Start with an optimal linear comparison tree $T'$ that computes the maxima. At every leaf,
we have a list $M$ with the maximal points in sorted order. We merge $M$ with the list of slab
boundaries of S to label each maximal point with the leaf slab of S containing it. We now deal
with the non-maximal points. Let $R_i$ be the region associated with a non-maximal point $p_i$, and
$R_j$ be the dominating region. Let $A$ be the leaf slab containing $R_j$. Note that the x-projection of
$R_i$ cannot extend to the right of $A$. If there is no slab boundary separating $R_i$ from $R_j$, then $R_i$
must intersect $A$. With one more comparison, we can place $p_i$ inside $A$ or strictly to the left of it.
All in all, with $O(n)$ more comparisons than $T'$, we have a tree $T''$ that computes the S-labeled
maxima. Hence, the expected depth is $\mathrm{OPT\text{-}MAX}_{\mathcal{D}} + O(n)$. Now we apply Lemma 3.4 to $T''$ to
get an entropy-sensitive comparison tree $T$ computing the S-labeled maxima with expected depth
$O(n + \mathrm{OPT\text{-}MAX}_{\mathcal{D}})$. □

5.2 The algorithm 

In the learning phase, the algorithm constructs a slab structure S and search trees $T_i$, as given
in Lemmas 3.8 and 3.9. Henceforth, we assume that we have these data structures, and will
describe the algorithm in the limiting (or stationary) phase. Our algorithm proceeds by progressively
searching for each point $p_i$ in its tree $T_i$. However, we need to choose the order of the searches
carefully.

At any stage of the algorithm, each point $p_i$ is placed in some slab $S_i$. The algorithm maintains
a set $A$ of active points. An inactive point is either proven to be non-maximal, or it has been placed
in a leaf slab. The active points are stored in a data structure $L(A)$, as in Claim 3.10. Recall that
$L(A)$ supports the operations insert, delete, decrease-key, and find-max. The key associated
with an active point $p_i$ is the right boundary of the slab $S_i$ (represented as an element of $[|S|]$).

We list the variables that the algorithm maintains. The algorithm is initialized with $A = P$,
and each $S_i$ is the largest slab in S. Hence, all points have key $|S|$, and we insert all these keys
into $L(A)$.

1. $A$, $L(A)$: the list $A$ of active points stored in data structure $L(A)$.

2. $\lambda$, $B$: Let $m$ be the largest key among the active points. Then $\lambda$ is the leaf slab whose right
boundary is $m$ and $B$ is a set of points located in $\lambda$. Initially $B$ is empty and $m$ is $|S|$,
corresponding to the $+\infty$ boundary of the rightmost, infinite, slab.

3. $M$, $\hat{p}$: $M$ is a sorted (partial) list of currently discovered maximal points and $\hat{p}$ is the leftmost
among those. Initially $M$ is empty and $\hat{p}$ is a "null" point that dominates no input point.

The algorithm involves a main procedure Search, and an auxiliary procedure Update. The
procedure Search chooses a point and advances its search by a single step in the appropriate tree.
Occasionally, it will invoke Update to change the global variables. The algorithm repeatedly calls
Search until $L(A)$ is empty. After that, we perform a final call to Update in order to process any
points that might still remain in $B$.

Search. Let $p_i$ be obtained by performing a find-max in $L(A)$. If the maximum key $m$ in $L(A)$ is
less than the right boundary of the current leaf slab $\lambda$, we invoke Update. If $p_i$ is dominated by $\hat{p}$
(the leftmost maximal point found so far), we delete $p_i$ from $L(A)$. If not, we advance the search of
$p_i$ in $T_i$ by a single step, if possible. This updates the slab $S_i$. If the right boundary of $S_i$ has
decreased, we perform the appropriate decrease-key operation on $L(A)$. (Otherwise, we do nothing.)

Suppose the point $p_i$ reaches a leaf slab $\lambda'$. If $\lambda' = \lambda$, we remove $p_i$ from $L(A)$ and insert it in $B$
(in time $O(|B|)$). Otherwise, we leave $p_i$ in $L(A)$.

Update. We sort all the points in $B$ and update the list of current maxima. As Claim 5.4 will
show, we have the sorted list of maxima to the right of the current leaf slab $\lambda$. Hence, we can append
to this list in $O(|B|)$ time. We reset $B = \emptyset$, set $\lambda$ to the leaf slab to the left of $m$, and return.

We now state an important invariant maintained by the algorithm, and then give
a construction for the data structure $L(A)$.

Claim 5.4. At any time in the algorithm, the maxima of all points to the right of the current leaf
slab $\lambda$ have been determined in sorted order.

Proof. The proof is by backward induction on $m$, the right boundary of $\lambda$. When $m = |S|$, then
this is trivially true. Let us assume it is true for a given value of $m$, and trace the algorithm's
behavior until the maximum key becomes smaller than $m$ (which is done in Update). When
Search processes a point $p$ with a key of $m$ then either (i) the key value decreases; (ii) $p$ is
dominated by $\hat{p}$; or (iii) $p$ is eventually placed in $\lambda$ (whose right boundary is $m$). In all cases, when
the maximum key decreases below $m$, all points in $\lambda$ are either proven to be non-maximal or are
in $B$. By the induction hypothesis, we already have a sorted list of maxima to the right of $m$. The
procedure Update will sort the points in $B$ and all maximal points to the right of $m - 1$ will be
determined. □
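
The interplay of Search and Update can be sketched in code. The sketch below is a simplified stand-in, not the algorithm as analyzed: it assumes distinct coordinates, replaces the distribution-specific trees $T_i$ by a uniform binary search over leaf-slab indices, and replaces the structure $L(A)$ of Claim 3.10 by an array of buckets indexed by key (possible here because keys only decrease). The names `interleaved_maxima`, `boundaries`, and `buckets` are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def interleaved_maxima(points: List[Point], boundaries: List[float]) -> List[Point]:
    """Interleaved search for maxima over vertical slabs (illustrative sketch).

    `boundaries` is a sorted list of slab boundaries; leaf slab j holds the
    points with boundaries[j-1] < x <= boundaries[j], and leaf slab
    len(boundaries) is the rightmost, unbounded slab.
    Returns the maximal points sorted from left to right.
    """
    n, k = len(points), len(boundaries)
    lo = [0] * n          # each point is known to lie in leaf slabs lo[i]..hi[i]
    hi = [k] * n          # hi[i] is the key of point i (its right boundary)
    buckets: List[List[int]] = [[] for _ in range(k + 1)]
    buckets[k] = list(range(n))   # initially every point has the largest key
    maxima: List[Point] = []      # discovered maxima, from right to left
    best_y = float("-inf")        # y-coordinate of the leftmost known maximum
    for m in range(k, -1, -1):    # m: current maximum key / current leaf slab
        pending = buckets[m]
        in_leaf: List[int] = []   # the set B of points located in leaf slab m
        while pending:            # Search: process points whose key equals m
            i = pending.pop()
            x, y = points[i]
            if y < best_y:
                continue          # dominated by the leftmost known maximum
            if lo[i] == hi[i]:
                in_leaf.append(i) # the search reached the current leaf slab
                continue
            mid = (lo[i] + hi[i]) // 2   # advance the search by one step
            if x <= boundaries[mid]:
                hi[i] = mid              # key decreases
            else:
                lo[i] = mid + 1          # key unchanged
            buckets[hi[i]].append(i)
        # Update: sort B and extend the list of maxima by a local sweep.
        in_leaf.sort(key=lambda i: points[i][0], reverse=True)
        for i in in_leaf:
            if points[i][1] > best_y:
                best_y = points[i][1]
                maxima.append(points[i])
    maxima.reverse()
    return maxima
```

For instance, `interleaved_maxima([(1, 5), (2, 3), (3, 4), (4, 1), (0, 0)], [1.5, 3.5])` discards the point `(0, 0)` after a single search step, because by then the maximum `(4, 1)` in the rightmost slab is already known. This early termination is the behaviour the running time analysis charges against the entropy-sensitive tree.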

5.2.1 Running time analysis 

The aim of this section is to prove the following lemma. 
Lemma 5.5. The algorithm runs in $O(n + \mathrm{OPT\text{-}MAX}_{\mathcal{D}})$ time.

We can easily bound the running time of all calls to Update.
Claim 5.6. The expected time for all calls to Update is $O(n)$.

Proof. The total time for all calls to Update is at most the time for sorting points within the leaf
slabs. By Lemma 3.8, this takes expected time $O(n)$. □

The important claim is the following, since it allows us to relate the time spent by Search to
the entropy-sensitive comparison trees. Lemma 5.5 follows directly from this.

Claim 5.7. Let $T$ be an entropy-sensitive comparison tree computing S-labeled maxima. Consider a
leaf $v$ labeled with the regions $\mathcal{R}_v = (R_1, R_2, \ldots, R_n)$, and let $d_v$ denote the depth of $v$. Conditioned
on $P \in \mathcal{R}_v$, the expected running time of Search is $O(n + d_v)$.

Proof. For each $R_i$, let $S_i$ be the smallest slab of S that completely contains $R_i$. We will show that
the algorithm performs at most an $S_i$-restricted search for input $P \in \mathcal{R}_v$. If $p_i$ is maximal, then $R_i$
is contained in a leaf slab (this is because the output is S-labeled). Hence $S_i$ is a leaf slab and an
$S_i$-restricted search for a maximal $p_i$ is just a complete search.

Now consider a non-maximal $p_i$. By the properties of S-labeled maxima, the associated region
$R_i$ is either inside a leaf slab or is separated by a slab boundary from the dominating region $R_j$. In
the former case, an $S_i$-restricted search is a complete search. In the latter case, we argue that an
$S_i$-restricted search suffices to process $p_i$. This follows from Claim 5.4: by the time an $S_i$-restricted
search finishes, all maxima to the right of $S_i$ have been determined. In particular, we have found
$p_j$, and thus $\hat{p}$ (the leftmost maximal point found so far) dominates $p_i$. Hence, the search for $p_i$
will proceed no further.

The expected search time taken conditioned on $P \in \mathcal{R}_v$ is the sum (over $i$) of the conditional
expected $S_i$-restricted search times. Let $\mathcal{E}_i$ denote the event that $p_i \in R_i$, and $\mathcal{E}$ be the event that
$P \in \mathcal{R}_v$. We have $\mathcal{E} = \bigwedge_i \mathcal{E}_i$. By the independence of the distributions and linearity of expectation,

$$E_{\mathcal{E}}[\text{search time}] = \sum_{i=1}^{n} E_{\mathcal{E}}[S_i\text{-restricted search time for } p_i]
= \sum_{i=1}^{n} E_{\mathcal{E}_i}[S_i\text{-restricted search time for } p_i].$$

By Lemma 3.7, the time for an $S_i$-restricted search conditioned on $p_i \in R_i$ is $O(-\log \Pr[p_i \in R_i] + 1)$.
By Proposition 3.3, $d_v = \sum_i -\log \Pr[p_i \in R_i]$, completing the proof. □

We can now prove the main lemma. 

Proof of Lemma 5.5. By Lemma 5.3, there exists an entropy-sensitive comparison tree $T$ that
computes the S-labeled maxima with expected depth $O(\mathrm{OPT\text{-}MAX}_{\mathcal{D}} + n)$. According to Claim 5.7, the
expected running time of Search is $O(\mathrm{OPT\text{-}MAX}_{\mathcal{D}} + n)$. Claim 5.6 tells us the expected time for
Update is $O(n)$, and we add these bounds to complete the proof. □

6 A self-improving algorithm for convex hulls 

We begin with an outline of the main ideas. The basic philosophy is the same as with the interleaved
search for maxima. We set up a slab structure S, and each distribution has a dedicated tree for
searching points. At any stage, each point is in some intermediate node of the search tree, and
we wish to advance the searches for points that have the greatest potential for being on the convex
hull. Furthermore, we would like to quickly ascertain that a point is not extremal, so that we can
terminate its search.

For maxima, this strategy is easy enough to implement. The "rightmost" point (among those
being searched) is a good candidate for being maximal and we always advance its search. We also
maintain the leftmost maximal point currently known. Any time we continue the search of some
point, we can always compare with this maximal point. Hence, any point that is dominated by the
current set of maximal points is immediately removed.

For convex hulls, this is much more problematic. At any stage, there are many points with
the potential of being extremal, and it is not clear how to choose between them. We also need a
procedure that can be rapidly updated as we find new extremal points, so that we quickly certify
non-extremal points that are currently being searched.

To perform these operations, we construct a canonical hull C in the learning phase. Roughly 
speaking, this is a very crude guide for some properties of the actual convex hull. The canonical 
hull has two key properties. First, any point that is below C is likely to be non-extremal. Second, 
there are not too many points above C. We build the slab structure S based on C. The searches 
only place points above or below C. A point p is proven to be below C if we find a segment contained 
in C above p. For all points p above C, we perform more searching to find all edges of C visible from 
p (think of this as completely certifying that p is above C). This procedure is referred to as the 
location algorithm. At the end of this, we have some partial information about the various points. 
Then we apply a construction algorithm, that computes the convex hull using this information. 



6.1 The canonical directions 

This section describes all the structures obtained in the learning phase. In order to characterize the
typical behavior of a random point set $P \sim \mathcal{D}$, we use a set V of canonical directions. A direction
is a two-dimensional unit vector, and directions are ordered clockwise. The directions we consider
will always point upwards. Given a direction $v$, we say that $p \in P$ is extremal for $v$ if the scalar
product $\langle p, v \rangle$ is maximum among all points in $P$. We denote the lexicographically smallest input
point that is extremal for $v$ by $e_v$. The canonical directions are characterized by the following
lemma, whose proof is postponed to Section 9.1. They are computed in the learning phase. (Some
of the basic notation used here is given by Definition 2.2 and just above it.)
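
The definition of $e_v$ can be made concrete with a short sketch. The function name `extremal_point` is an illustrative choice, and points and directions are assumed to be given as coordinate pairs.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def extremal_point(points: List[Point], v: Point) -> Point:
    """Return e_v: the lexicographically smallest point maximising <p, v>."""
    def dot(p: Point) -> float:
        return p[0] * v[0] + p[1] * v[1]

    best = max(dot(p) for p in points)
    # Tuples compare lexicographically, so min() breaks ties as required.
    return min(p for p in points if dot(p) == best)
```

For the upward direction `v = (0, 1)` and the points `[(0, 0), (1, 2), (3, 2), (2, 1)]`, both `(1, 2)` and `(3, 2)` maximise the scalar product, and the tie is broken in favour of `(1, 2)`.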

Lemma 6.1. Let $k := n/\log^2 n$. There is an $O(n\,\mathrm{poly}(\log n))$ procedure that requires $\mathrm{poly}(\log n)$
random inputs and outputs an ordered sequence $V = v_1, v_2, \ldots, v_k$ of directions such that the
following holds (with probability at least $1 - n^{-4}$ over construction). Let $P \sim \mathcal{D}$ be a random input.
For $i = 1, \ldots, k$, let $e_i := e_{v_i} \in P$, let $X_i$ be the number of points from $P$ inside uss$(e_i, e_{i+1})$, and
$Y_i$ the number of extremal points inside uss$(e_i, e_{i+1})$. Then

$$E_{P \sim \mathcal{D}}\Big[\sum_{i=1}^{k} X_i \log(Y_i + 1)\Big] = O(n \log\log n).$$

Given the canonical directions, we construct some special lines normal to them in the learning 
phase. The details are given in Section 9.2. 

Lemma 6.2. We can construct (in $O(n\,\mathrm{poly}(\log n))$ time with a single sample input) lines $\ell_1, \ell_2,
\ldots, \ell_k$, with $\ell_i$ normal to $v_i$, and with the following property (with probability at least $1 - n^{-4}$ over
construction). For $i = 1, \ldots, k$ (and sufficiently large constant $c$),

$$\Pr\big[\,|\ell_i^+ \cap P| \in [1, c \log n]\,\big] \ge 1 - n^{-3}.$$

We will henceforth assume that the learning phase succeeds, so the directions and lines obtained
have the desired properties. We say that $p \in P$ is V-extremal if $p = e_v$ for some $v \in V$. Using the
canonical directions from Lemma 6.1 and the lines from Lemma 6.2, we construct a canonical hull C
that is meant to be "typical" for a random $P \sim \mathcal{D}$. We define the canonical hull C as the intersection
of the halfplanes below the $\ell_i$, i.e., $C := \bigcap_{i=1}^{k} \ell_i^-$. It is a convex polygonal region bounded by the
lines $\ell_i$. We state a simple corollary, that holds by taking a union bound of Lemma 6.2 over all $i$.
Observe that it also implies that the total number of points outside C is $O(n/\log n)$.

Figure 7: Left: The C-leaf slabs are shown in dashed lines. The shaded portion corresponds to a
C-slab $S$. Right: The shaded region is the pencil of $p$ and the area between the dashed lines is the
pencil slab. The point $q_1$ is above the pencil, $q_2$ lies inside it, and $q_3$ is not comparable to it.

Corollary 6.3. Assume the learning phase succeeds. With probability at least $1 - n^{-2}$, the following
holds. For all $i$, the extremal point for $v_i$ is outside C. The number of pairs $(p, s)$, where $p \in P \setminus C$,
$s$ is an edge of C, and $s$ is visible from $p$, is $O(n/\log n)$.
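
To make the quantities in the corollary concrete, the sketch below tests membership in an intersection of lower halfplanes and counts (point, visible edge) pairs. It is an illustration under stated assumptions: each line is encoded as a hypothetical pair `(v, c)` meaning $\{z : \langle z, v \rangle = c\}$ with upward normal `v`, every line is assumed to contribute an edge to C, and an edge is taken to be visible from `p` exactly when `p` lies strictly above its supporting line (which holds for convex regions).

```python
from typing import List, Tuple

Point = Tuple[float, float]
Line = Tuple[Point, float]  # (normal v, offset c): the line <z, v> = c


def visible_edges(p: Point, lines: List[Line]) -> List[int]:
    """Indices of the lines whose lower halfplane <z, v> <= c excludes p."""
    return [i for i, (v, c) in enumerate(lines)
            if p[0] * v[0] + p[1] * v[1] > c]


def is_outside_hull(p: Point, lines: List[Line]) -> bool:
    """p is outside the canonical hull iff it violates some halfplane."""
    return len(visible_edges(p, lines)) > 0


def count_visible_pairs(points: List[Point], lines: List[Line]) -> int:
    """Number of pairs (p, s) with p outside the hull and edge s visible from p."""
    return sum(len(visible_edges(p, lines)) for p in points)
```

With the three lines `y = 2`, `x + y = 3`, and `-x + y = 3`, the point `(0, 0)` is inside the region, while `(1, 3)` is outside and sees two edges.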

We now list some preliminary concepts related to C, see Figure 7. By drawing a vertical line
through each vertex of C, we obtain a subdivision of the plane into vertical slabs. We call these
slabs the C-leaf slabs. A contiguous interval of C-leaf slabs again forms a vertical slab, which we
call a C-slab. The C-leaf slabs constitute the slab structure for the convex hull algorithm, and we
use Lemma 3.9 to construct appropriate search trees $T_1, \ldots, T_n$ for the C-leaf slabs and for each
distribution $\mathcal{D}_i$.

For a C-slab $S$, we define seg(C, $S$) as the line segment that connects the two vertices of C that
lie on the vertical boundaries of $S$. Let $p$ be a point outside of C, and let $a_1$ and $a_2$ be the vertices
of C in which the two tangents for C through $p$ touch C. The pencil slab for $p$ is the vertical slab
bounded by the vertical lines through $a_1$ and $a_2$. The pencil of $p$ is defined as the region inside
the pencil slab for $p$ that lies below the line segments $\overline{a_1 p}$ and $\overline{p a_2}$. A point $q$ is comparable to the
pencil of $p$ if it lies inside the pencil slab for $p$. It lies above the pencil of $p$ if it is comparable to
the pencil of $p$ but not inside it.
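
The three possible relations between a point and a pencil can be checked with one orientation test. The sketch below assumes that the tangent vertices satisfy $x(a_1) < x(p) < x(a_2)$ and that boundaries count as inside; the function name and the string labels are illustrative choices.

```python
from typing import Tuple

Point = Tuple[float, float]


def classify_against_pencil(q: Point, p: Point, a1: Point, a2: Point) -> str:
    """Classify q relative to the pencil of p with tangent vertices a1, a2.

    Returns "not comparable" if q is outside the pencil slab, "inside" if q
    lies on or below the polyline a1 -> p -> a2, and "above" otherwise.
    Assumes a1[0] < p[0] < a2[0].
    """
    if q[0] < a1[0] or q[0] > a2[0]:
        return "not comparable"
    # Pick the segment of the polyline whose x-range contains q.
    left, right = (a1, p) if q[0] <= p[0] else (p, a2)
    # Cross product of (right - left) and (q - left): it is positive exactly
    # when q lies strictly above the line through left and right.
    cross = ((right[0] - left[0]) * (q[1] - left[1])
             - (right[1] - left[1]) * (q[0] - left[0]))
    return "above" if cross > 0 else "inside"
```

With `a1 = (0, 0)`, `p = (2, 4)`, and `a2 = (4, 0)`, the point `(1, 1)` is inside the pencil, `(1, 3)` is above it, and `(5, 0)` is not comparable to it, mirroring the three points $q_2$, $q_1$, $q_3$ of Figure 7.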

6.2 Certificates 

We need more refined notions of certificates that relate to the canonical hull C. We remind the
reader that a certificate for convex hulls has a sorted list of extremal points in $P$, and a witness
pair for each non-extremal point in $P$. The points $(q, r)$ form a witness pair for $p$ if $p \in \mathrm{lss}(q, r)$.

Figure 8: The different possibilities for the C-slab associated with $p$. The three shaded regions show
the three possibilities for $S_p$. In the left shaded region, $p$ is contained in a leaf slab. The middle
region shows a point that is below seg(C, $S_p$). To the right, the point $p$ lies in the pencil of $e_v$.

A witness pair $(q, r)$ is called extremal if both $q$ and $r$ lie on conv($P$). We call $(q, r)$ V-extremal
if both $q$ and $r$ are V-extremal. Two distinct extremal points $q$ and $r$ are called adjacent if there is
no extremal point whose x-coordinate lies strictly between the x-coordinates of $q$ and $r$. Adjacent
V-extremal points are defined analogously.

We now define a C-certificate for $P$. It consists of (i) a list of the V-extremal points of $P$, sorted
from left to right; and (ii) a list that has a C-slab $S_p$ for every other point $p \in P$. This C-slab $S_p$
contains $p$ and can be of three different kinds. Either

1. $S_p$ is a C-leaf slab; or

2. $p$ lies below seg(C, $S_p$); or

3. $S_p$ is the pencil slab for a V-extremal vertex $e_v$ such that $p$ lies in the pencil of $e_v$.

The different possibilities are illustrated in Figure 8. The following key lemma is an important 
piece of the analysis. We relegate the proof to the next section. The reader may wish to skip that 
section and proceed to learn about the algorithm. 

Lemma 6.4. Assume C is obtained from a learning phase that succeeds. Let $T$ be a linear
comparison tree that computes the convex hull of $P$. Then there is an entropy-sensitive linear comparison
tree with expected depth $O(n + d_T)$ that computes C-certificates for $P$.

6.3 From regular certificates to C-certificates: proof of Lemma 6.4 

The proof proceeds through various intermediate steps that successively transform a regular cer- 
tificate into a C-certificate, such that each step incurs only linear overhead in expectation. Then it 
suffices to apply Lemma 3.4 to obtain an entropy-sensitive search tree of comparable depth. 

A certificate γ is extremal if all witness pairs in γ are extremal. We provide the chain of lemmas 
needed and give each proof in a different subsection. The following is proved in Section 6.3.1. 

Lemma 6.5. Let T be a linear comparison tree that computes conv(P). Then there exists a linear 
comparison tree with expected depth d_T + O(n) that computes an extremal certificate for P. 

Figure 9: The shortcut operation: Observe that computing the convex hull of the out-neighbors 
of p, q_1, and q_2 suffices for removing p from all witness pairs. 

A certificate is V-extremal if it contains (i) a list of the V-extremal points of P, sorted from 
left to right; and (ii) a list that stores for every other point p ∈ P either a V-extremal witness 
pair for p or two adjacent V-extremal points e_1 and e_2 such that x(e_1) ≤ x(p) ≤ x(e_2). The next 
lemma is proved in Section 6.3.2. 

Lemma 6.6. Let T be a linear comparison tree that computes extremal certificates. Then there is 
a linear comparison tree with expected depth d_T + O(n) that computes V-extremal certificates. 

The final lemma takes us from V-extremal certificates to canonical certificates. The proof is in 
Section 6.3.3. 

Lemma 6.7. Let T be a linear comparison tree that computes V-extremal certificates. Then there 
is a linear comparison tree with expected depth d_T + O(n) that computes C-certificates. 

Lemma 6.4 follows by combining the above lemmas with Lemma 3.4. 

6.3.1 Extremal certificates 

Proof of Lemma 6.5. We transform T into a tree that computes extremal certificates. Since each 
leaf v of T corresponds to a certificate that is valid for all P with v = v(P), it suffices to show 
how to convert a given certificate γ for P to an extremal certificate by performing O(n) additional 
comparisons on P. We describe an algorithm for this task. 

The algorithm uses two data structures: (i) a directed graph G whose vertices are a subset of P; 
and (ii) a stack S. Initially, S is empty and G has a vertex for every p ∈ P. For each non-extremal 
point p ∈ P, we add two directed edges pq and pr to G, where (q, r) is the witness pair for p 
according to γ. In each step, the algorithm performs one of the following operations, until G has 
no more edges left (we will use the terms point and vertex interchangeably, since we always mean 
some p ∈ P). 

• Prune. If G has a non-extremal vertex p with indegree zero, we delete p from G (together 
with its outgoing edges) and push it onto S. 

• Shortcut. If G has a non-extremal vertex p with indegree 1 or 2, we find for each in-neighbor 
q of p a witness pair that does not include p and we replace the out-edges from q by edges to 
this new pair. (We explain shortly how to do this.) The indegree of p is now zero. 

Figure 10: The point p lies in the shaded region. Hence, either x(p) ∈ [x(q'), x(q'')], p lies in 
lss(q'', r'), or x(p) ∈ [x(r'), x(r'')]. 

An easy induction shows that our algorithm maintains the following invariants: (i) all non-extremal 
vertices in G have out-degree 2; (ii) all extremal vertices of G have out-degree 0; (iii) for each 
non-extremal vertex p of G, the two out-neighbors of p constitute a witness pair for p; (iv) every p ∈ P 
is either in G or in S, but never both; (v) when a point p is added to S, then we have a witness 
pair (q, r) for p such that q, r ∉ S. 

We analyze the number of comparisons on P required by each operation. Prune needs no 
comparisons. Shortcut is performed as follows: we consider for each in-neighbor q of p the upper 
convex hull U for p's two out-neighbors and q's other out-neighbor, and we find the edge of U 
that lies above q. Since U has constant size and since p has in-degree at most 2, this takes O(1) 
comparisons; see Figure 9. There are at most n Shortcuts, so the total number of comparisons 
is O(n). Note that deciding which operation to perform depends solely on G and requires no 
comparisons on P. 
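
The Shortcut step for one in-neighbor q can be illustrated as follows. This is a sketch under stated assumptions: points are (x, y) tuples in general position (distinct x-coordinates), and the helper names `orient`, `upper_hull` and `edge_above` are hypothetical. The hull routine is a standard monotone-chain scan, which is more than is needed for three points but keeps the code short.

```python
def orient(a, b, c):
    """Positive iff c is left of the directed line a -> b (i.e. above it when a is left of b)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def upper_hull(points):
    """Upper convex hull, left to right (monotone chain)."""
    hull = []
    for pt in sorted(set(points)):
        # Pop while the last two hull points and pt fail to make a right turn.
        while len(hull) >= 2 and orient(hull[-2], hull[-1], pt) >= 0:
            hull.pop()
        hull.append(pt)
    return hull


def edge_above(q, candidates):
    """Edge (a, b) of the upper hull U of `candidates` that lies on or above q, or None.

    In Shortcut, `candidates` holds p's two out-neighbors and q's other out-neighbor.
    """
    hull = upper_hull(candidates)
    for a, b in zip(hull, hull[1:]):
        if a[0] <= q[0] <= b[0] and orient(a, b, q) <= 0:
            return (a, b)
    return None
```

Here p = (3, 3) has witness pair ((0, 4), (6, 4)) and q = (5, 1) has witness pair (p, (8, 0)); calling `edge_above((5, 1), [(0, 4), (6, 4), (8, 0)])` yields a new witness pair for q that no longer involves p.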

We now argue that the algorithm cannot get stuck. That means that if G has at least one 
edge, Prune or Shortcut can be applied. Suppose that we cannot perform Prune. Then each 
non-extremal vertex has in-degree at least 1. Consider the subgraph G' of G induced by the non- 
extremal vertices. Since all extremal vertices have out-degree 0, all vertices in G' have in-degree at 
least 1. The average out-degree in G' is at most 2, so there must be a vertex with in-degree (in G') 
1 or 2. This in-degree is the same in G, so Shortcut can be applied. 

Thus, we can perform Prune and Shortcut until G has no more edges and all non-extremal 
points are on the stack S. Now we pop the points from S and find extremal witness pairs for them. 
Let p be the next point on S. By the last invariant, there is a witness pair (q, r) for p whose vertices 
are not on S. Thus, each of q and r is either extremal or we have an extremal witness pair for it. 
Therefore, we can find an extremal witness pair for p with O(1) comparisons, as in Shortcut. We 
repeat this process until S is empty. This takes O(n) comparisons overall, so we can construct an 
extremal certificate γ' from γ with O(n) comparisons on P. □ 

6.3.2 V-Extremal Certificates 



Proof of Lemma 6.6. As in the proof of Lemma 6.5, it suffices to show how to convert a given 
extremal certificate into a V-extremal certificate with O(n) comparisons on P. This is done as 
follows. First, we determine the V-extremal points on conv(P). This takes O(n) comparisons by 
a simultaneous traversal of conv(P) and V. Without further comparisons, we can now find for 
each extremal point p in P the two adjacent V-extremal points that have p between them. This 
information is stored in the V-extremal certificate. 

Now let p ∈ P be non-extremal, and let (q, r) be the corresponding extremal witness pair. We show 
how to find, in O(1) comparisons, either a V-extremal witness pair for p or the correct pair of 
adjacent V-extremal points. 

We have determined adjacent V-extremal points q', q'' such that x(q) ∈ [x(q'), x(q'')]. (If q is 
itself V-extremal, just set q' = q'' = q.) Similarly, define adjacent V-extremal points r', r''. We know 
that p lies in lss(q, r) and hence x(p) ∈ [x(q), x(r)]. Furthermore, the points q', q, q'', r', r, r'' are all 
in convex position. Since p is in lss(q, r), one of the following must happen: x(p) ∈ [x(q'), x(q'')], p 
lies in lss(q'', r'), or x(p) ∈ [x(r'), x(r'')]; see Figure 10. We can determine which in O(1) 
comparisons. □ 



6.3.3 Canonical Certificates 

Proof of Lemma 6.7. Similar to the proofs of Lemmas 6.5 and 6.6, we convert a V-extremal 
certificate γ into a C-certificate with O(n) expected comparisons on P. 

This works as follows. The certificate γ provides a list of the V-extremal points in P. For 
each such V-extremal point, we perform a binary search to find the C-leaf slab that contains it. 
This requires o(n) comparisons, since there are at most n/log² n V-extremal points and since each 
binary search needs O(log n) comparisons. Next, we check, for each i ≤ k, that the extremal point 
for v_i lies in ℓ_i^+. This takes one comparison per point. If any of these checks fails, we simply declare 
failure and use binary search to find for every p ∈ P a C-leaf slab that contains it. 

We now assume that there exists a V-extremal point in every ℓ_i^+. (This implies that all 
V-extremal points lie outside C.) We use binary search to determine the pencil of each V-extremal 
point. Again, this takes o(n) comparisons. Now let p ∈ P be not V-extremal. We will use O(1) 
comparisons and either find the slab S_p or determine that p lies above C. The certificate γ assigns 
to p two V-extremal points e_1 and e_2 such that either (i) (e_1, e_2) is a V-extremal witness pair for 
p; or (ii) e_1 and e_2 are adjacent and x(e_1) ≤ x(p) ≤ x(e_2). We define f_1 to be the rightmost visible 
point of C from e_1 and f_2 to be the leftmost visible point from e_2. 

Let us consider the first case. Refer to Figure 11, left. The point p is below the segment e_1e_2. Since 
e_1, f_1, f_2, e_2 are in convex position, e_1e_2 is below the convex hull of these points. This means that 
one of the following must happen: x(p) ∈ [x(e_1), x(f_1)], x(p) ∈ [x(f_2), x(e_2)], or p is below f_1f_2. 
This can be determined in O(1) comparisons. In the first two cases, p lies in a pencil (and hence 
we find an appropriate S_p), and in the last case, we find a witness C-slab. Now for the second case. 
We will need the following claim. 

Claim 6.8. If, for all i, there exists a V-extremal point in ℓ_i^+, then the pencils of two adjacent 
V-extremal points either overlap or share a slab boundary. 

Proof. Refer again to Figure 11, left. Let e_1 and e_2 be two adjacent V-extremal vertices such that 
their pencil slabs do not overlap or share a boundary. Then f_1 is not visible from e_2. Consider 
the edge a of C where f_1 is the left endpoint. The edge a is not visible from either e_1 or e_2 and is 
between them. By the claim assumption, there exists an extremal point x of P that sees a. But 
the point x cannot lie to the left of e_1 or to the right of e_2 (that would violate the extremal nature 
of e_1 or e_2). Hence, x must be between e_1 and e_2, contradicting the fact that they are adjacent. □ 

Figure 11: Canonical certificates: In each part, p is contained in the shaded region. 

With this claim, we can assert that p is comparable to one of the pencils of e_1, e_2. By O(1) 
comparisons, we can check if p is contained in either of these pencils or is above C. 

Finally, for all points determined to be above C, we simply use binary search to place them 
in a C-leaf slab. So for each p, we have found an appropriate S_p and the canonical certificate is 
complete. We analyze the total number of comparisons. Let X be the indicator random variable 
for the event that there exists some ℓ_i^+ that does not contain a V-extremal point. Let Y denote the 
number of points above C. By Corollary 6.2, E[X] ≤ n^{-3} and E[Y] = O(n/log n). The number of 
comparisons is at most O(Xn log n + n + Y log n), the expectation of which is O(n). □ 
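
For completeness, the final expectation can be spelled out by linearity of expectation, using only the two bounds from Corollary 6.2 quoted above (E[X] at most n^{-3} and E[Y] = O(n/log n)):

```latex
\begin{align*}
\mathbb{E}\bigl[X n \log n + n + Y \log n\bigr]
  &= n \log n \cdot \mathbb{E}[X] + n + \log n \cdot \mathbb{E}[Y] \\
  &\le n^{-3} \cdot n \log n + n + O(n/\log n) \cdot \log n \\
  &= O(n^{-2} \log n) + n + O(n) \;=\; O(n).
\end{align*}
```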

6.4 The algorithm 

At long last, we have all the tools to describe the details of our convex hull algorithm. It has two 
parts: the location algorithm and the construction algorithm. The former algorithm determines the 
location of the input points with respect to the canonical hull C. It must be careful to learn just 
the right amount of information about each point, so that the running time is not too large. The 
latter algorithm uses the resulting information to compute the convex hull of P quickly. 

6.4.1 The location algorithm 

Using Lemma 3.9, we obtain near-optimal search trees T_i for the slab structure induced by the 
C-leaf slabs. The algorithm searches progressively for each p_i ∈ P in its corresponding tree T_i. 
However, it is important to coordinate the searches carefully and to abort the search for a point p_i 
as soon as we have obtained enough information about it. The location algorithm maintains the 
following information. 

• The current slabs C_i. For each point p_i ∈ P, we store a current C-slab C_i containing p_i 
that corresponds to a node of T_i. 

• A set of active points A. The active points are stored in a priority queue L(A) as in 
Claim 3.10. The key associated with an active point p_i ∈ A is the size of the associated 
current slab C_i (represented as an integer between 1 and k). 

• The extremal candidates e_v. For each canonical direction v ∈ V, we store a point e_v ∈ P 
that lies outside of C. We call e_v an extremal candidate for v. 

• The pencils for the points outside of C. For each point p that has been located outside 
of C, we store its pencil pen(p). 

• The points with the left- and rightmost pencils. For each edge s of C, we store two 
points p_{s1} and p_{s2} such that (i) p_{s1} and p_{s2} lie outside of C; (ii) both pen(p_{s1}) and pen(p_{s2}) 
contain s; (iii) among all pencils seen so far that contain s, the left boundary of pen(p_{s1}) lies 
furthest to the left; (iv) among all pencils seen so far that contain s, the right boundary of 
pen(p_{s2}) lies furthest to the right. 

Figure 12: The algorithm: the boundary of C_i is shown dashed, the pencil pen(p_{s_a}) is shaded. 

Initially, we set A = P and each C_i to the root of the corresponding search tree T_i. The extremal 
candidates e_v as well as the points p_{s1}, p_{s2} with the left- and rightmost pencils are initialized to the 
null pointer. Since no point has been located outside of C so far, there is no pencil pen(p) yet. 

The location algorithm proceeds in rounds. In each round, we perform a find-max on L(A). 
Suppose that find-max returns p_i. We compare p_i with the vertical line that corresponds to its 
current node in T_i and advance C_i to the appropriate child. This reduces the size of C_i, so we also 
need to perform an appropriate decrease-key on L(A). Next, we distinguish three cases: 

Case 1: p_i lies below seg(C, C_i). We declare p_i inactive and delete it from L(A). 

For the next two cases, we know that p_i lies above seg(C, C_i). Let ℓ_a, ℓ_b be the canonical lines 
that support the edges s_a and s_b of C that are incident to the boundary vertices of C_i and lie inside 
of C_i; see Figure 12. We now check where p_i lies with respect to ℓ_a and ℓ_b. 

Case 2: p_i is above ℓ_a or above ℓ_b. We now know that p_i lies outside of C. We declare p_i inactive 
and delete it from L(A). Next, we perform a binary search to determine pen(p_i) and to find all 
the edges of C that are visible from p_i. For each such edge s, we compare p_i with the extremal 
candidate for s, and if p_i is more extreme in the corresponding direction, we update the extremal 
candidate accordingly. We also update the points p_{s1} and p_{s2} to p_i, if necessary. 

Case 3: p_i lies below ℓ_a and ℓ_b. Recall that ℓ_a corresponds to the edge s_a of C and ℓ_b corresponds 
to the edge s_b of C. We take the rightmost pencil for s_a and the leftmost pencil for s_b (if they 
exist). This is shown in Figure 12. We compare p_i with these pencils. If p_i lies inside one of the 
two pencils, we are done. If p_i is above one of the two pencils, we learn that p_i lies outside of C, 
and we process as in Case 2. In both situations, we declare p_i inactive and delete it from L(A). 
If neither of these happens, p_i remains active. 

The location algorithm continues until A is empty (note that every point becomes inactive 
eventually, because as soon as C_i is a leaf slab, either Case 1 or Case 2 must apply). 
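
The round structure (always advance the active point with the largest current slab by one level) can be sketched as below. This is only a scheduling skeleton under several simplifying assumptions that are not in the paper: each tree T_i is replaced by a balanced binary tree over the leaf slabs 0..k-1, the priority queue of Claim 3.10 is replaced by Python's `heapq` (which costs O(log n) per operation), and the geometric tests of Cases 1 to 3 are hidden behind a hypothetical callback `resolved`.

```python
import heapq


def locate(points, k, leaf_of, resolved):
    """Run rounds until no point is active.

    points   : list of point identifiers
    k        : number of leaf slabs, numbered 0..k-1
    leaf_of  : leaf_of(p) -> index of the leaf slab containing p
    resolved : resolved(p, lo, hi) -> True if Cases 1-3 would make p inactive
               once its current slab is [lo, hi)  (hypothetical stand-in)
    Returns (final slab per point, number of rounds per point).
    """
    slab = {p: (0, k) for p in points}
    rounds = {p: 0 for p in points}
    # Max-heap on slab size via negated keys; the index i breaks ties.
    heap = [(-k, i, p) for i, p in enumerate(points)]
    heapq.heapify(heap)
    while heap:
        _, i, p = heapq.heappop(heap)            # find-max
        lo, hi = slab[p]
        mid = (lo + hi) // 2                     # one comparison with a vertical line
        lo, hi = (lo, mid) if leaf_of(p) < mid else (mid, hi)
        slab[p] = (lo, hi)
        rounds[p] += 1
        if hi - lo > 1 and not resolved(p, lo, hi):
            heapq.heappush(heap, (-(hi - lo), i, p))   # plays the role of decrease-key
    return slab, rounds
```

With a `resolved` callback that never fires, every point is searched down to its leaf slab; a callback that fires early models a point that is certified by Case 1 or by a pencil in Case 3 and therefore costs fewer rounds.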

6.4.2 Running time of the location algorithm 

We now analyze the running time of the location algorithm. We begin with some preliminary claims 
about the behavior of the algorithm. We note that the algorithm is itself deterministic, and hence 
we can talk of deterministic properties of the behavior on any input. 

Claim 6.9. Consider the extremal point e_v (in an input P) and let its pencil slab be S. Suppose 
the search for e_v reaches a slab D such that |D| ≤ |S|. The algorithm will detect that e_v is an 
extremal point (for direction v). 

Proof. At least one vertical boundary line of D must be inside (the closure of) S and D ∩ S must 
contain at least one leaf slab. By the definition of a pencil slab, e_v sees all edges of C in D ∩ S, so one 
of the edges s_a or s_b corresponding to D, as defined in the algorithm (refer to Figure 12), must be 
visible to e_v. Hence, e_v lies in ℓ_a^+ ∪ ℓ_b^+ and this is detected in Case 2 of the location algorithm. □ 

Claim 6.10. Consider a point p contained in the pencil of an extremal point e_v. Let the pencil slab 
of e_v be S and suppose the search for p reaches a slab D such that |D| ≤ |S|. Then, in the next 
round that p is processed, p becomes inactive. 

Proof. Consider the situation when the search for p has reached D, where |D| ≤ |S| and this round 
completes. The location algorithm schedules the points according to the size of the current slab. 
Therefore, when p is processed again, all other active points are placed in slabs of size at most 
|D|. But, by Claim 6.9, if e_v is ever placed in a slab of size at most |S|, then the algorithm detects that 
it is extremal and makes it inactive. 

Therefore, when p is processed next, e_v has been determined to be the extremal point in the 
v direction. Note that D ∩ S ≠ ∅, since p ∈ D ∩ S. Some boundary (suppose it is the left one) 
of D lies inside S. Let s_a be the corresponding edge of C, as used by the location algorithm; see 
Figure 13. Since s_a is visible from e_v, and since e_v has been processed already, it follows that the 
pencil slab of the rightmost pencil for s_a spans all of D ∩ S. In Case 3 of the location algorithm (in 
this round), either p will be found inside this pencil slab, or will be found outside C. Either way, p 
becomes inactive. □ 

We arrive at the main lemma of this section. 

Lemma 6.11. The total number of rounds in the location algorithm is O(n + OPT). 

Proof. Let T be an entropy-sensitive linear comparison tree that computes a C-certificate for P in 
expected time O(n + OPT). Such a tree exists by Lemma 6.4. 

Let v be a leaf of T. By Proposition 3.1, we can write ℛ_v as a Cartesian product ℛ_v = ∏_{i=1}^n R_i. 
The depth of v is d_v = −∑_{i=1}^n log Pr[p_i ∈ R_i], by Proposition 3.3. Now consider a random input 
P, conditioned on P ∈ ℛ_v. We will show that the expected number of rounds for P is O(n + d_v). This 
implies the lemma, because the expected number of rounds is 

\[ \sum_{v \text{ leaf of } T} \Pr[P \in \mathcal{R}_v] \cdot O(n + d_v) = O(n + d_T). \]

Let γ be the C-certificate corresponding to v. The main technical argument is summarized in 
the following claim. 

Claim 6.12. Consider a point p_i ∈ P, where P ∈ ℛ_v. The number of rounds involving p_i is at 
most one more than the number of steps required for an S_{p_i}-restricted search for p_i in T_i. 

Proof. By definition of the canonical certificates, S_{p_i} is of three types. Either S_{p_i} is a C-leaf slab, 
p_i is below seg(C, S_{p_i}), or S_{p_i} is a pencil slab of an extremal vertex. In all cases, S_{p_i} contains R_i. 
When S_{p_i} is a leaf slab, then an S_{p_i}-restricted search for p_i is just a complete search. Hence, this 
is always at least the number of rounds involving p_i. Suppose p_i is below seg(C, S_{p_i}). For any slab 
S ⊆ S_{p_i}, seg(C, S) is above seg(C, S_{p_i}). If p_i is located in any slab S ⊆ S_{p_i}, it is made inactive (Case 
1 of the algorithm). 

Now for the last case. The slab S_{p_i} is the pencil slab for a V-extremal vertex e_v, such that the 
pencil of e_v, pen(e_v), contains p_i. Suppose the search for p_i leads to slab D ⊆ S_{p_i} and p_i is still 
active. By Claim 6.10, since |D| ≤ |S_{p_i}|, p_i becomes inactive in the next round. □ 

Suppose that P is chosen randomly from ℛ_v. The distribution restricted to p_i is simply random 
from R_i. By Lemma 3.7, the expected S_{p_i}-restricted search time is O(1 − log Pr[p_i ∈ R_i]). Combining 
with Claim 6.12, the expected number of rounds is 

\[ O\Bigl(n - \sum_{i=1}^{n} \log \Pr[p_i \in R_i]\Bigr) = O(n + d_v), \]

as claimed. The lemma follows. □ 

Figure 14: The lines perpendicular to directions v_1 and v_2 define the upper boundary of the shaded 
region, where p lies. All edges seen by a point in the shaded region can be seen by either e_1 or e_2. 



Lemma 6.13. The expected running time of the location algorithm is O(n + OPT). 

Proof. By Claim 3.10, the total overhead for the data structure operations is linear in the number 
of rounds. The time to implement Case 1 and Case 3 is O(1), since this just requires comparing 
p_i with a constant number of lines. Hence, the total time for this is at most the total number of 
rounds (up to a constant factor). 

For Case 2, we perform a binary search for p_i and an extremal point (and pencil) update for 
every edge s visible from p_i. The binary search only happens if p_i lies outside C. By Corollary 6.3, 
the expected number of extremal point updates is O(n/log n). Overall, the total overhead of Case 2 
operations is O(n). Combining with Lemma 6.11, the expected running time is O(n + OPT). □ 

6.4.3 The construction algorithm 

We now describe the convex hull construction that uses the information from the location algorithm 
in order to compute conv(P) quickly. First, we dive into the geometry of pencils. 

Claim 6.14. Suppose that all V-extremal points of P lie outside of C, and let e_v be a V-extremal 
point. Then e_v does not lie in the pencil of any other point p ∈ P outside C. 

Proof. Suppose that e_v ∈ pen(p) for another point p ∈ P that lies outside C. Then a vertex of 
pen(p) would be more extremal in direction v than e_v. It cannot be p, since then e_v would not be 
extremal in direction v. But it also cannot be a vertex of C, because e_v lies in ℓ_v^+, while all vertices 
of C lie on ℓ_v or in ℓ_v^−. Thus, p cannot exist. □ 

Claim 6.15. Suppose that all V-extremal points of P lie outside of C. Let e_1 and e_2 be two 
adjacent V-extremal points and let p ∈ P be above C such that the x-coordinate of p lies between 
the x-coordinates of e_1 and e_2. Then, the portion of pen(p) below C is contained in pen(e_1) ∪ pen(e_2). 

Proof. By Claim 6.8, the (closures of the) pencil slabs of e_1 and e_2 overlap. Let v_1 be the last 
canonical direction for which e_1 is extremal and v_2 be the first canonical direction for which e_2 is 
extremal. Since e_1 and e_2 are adjacent, v_1 and v_2 are consecutive directions in V; see Figure 14. 
Consider the convex region bounded by the vertical downward ray from e_1, the vertical downward 
ray from e_2, the line parallel to ℓ_{v_1} through e_1 and the line parallel to ℓ_{v_2} through e_2. By 
construction, p lies inside this convex region (the shaded area in Figure 14). By convexity, for every 
canonical direction v ∈ V, at least one of e_1 or e_2 is more extremal with respect to v than p. Hence, 
any edge of C visible from p is visible from either e_1 or e_2. The portion of pen(p) below C consists of 
the union of regions below edges of C visible from p. Therefore, it lies in pen(e_1) ∪ pen(e_2). □ 

As described in Section 6.4.1, when the location algorithm is done, for each p ∈ P we know that 
either (a) p lies outside of C; (b) p lies inside of C, as witnessed by a segment seg(C, C_p); or (c) p lies 
inside the pencil of a point that we located outside of C. We also have the extremal vertex e_v for all 
v ∈ V. We will now show how to use this information in order to find conv(P). By Corollary 6.3, 
with probability at least 1 − n^{-2}, for each canonical direction in V there is an extremal point outside 
of C, and the total number of points outside C is O(n/log n). We will henceforth assume these 
conditions to be true. (If they do not hold, we can compute conv(P) in O(n log n) time. This affects the 
expected running time by a lower order term.) 

For any point a, the V-pair for a is the pair of adjacent V-extremal points such that a lies 
between them. The construction algorithm involves a series of steps. The exact details of some of 
these steps will be given in subsequent claims. 

1. Compute the convex hull of the V-extremal points. 

2. For each vertex a of C, compute the V-pair for a. 

3. For each input point p outside C, compute its V-pair by binary search. 

4. For each input point p below a segment seg(C, C_p), in O(1) time, either find its V-pair or 
find a segment between V-extremal points above it. (Details in Claim 6.18.) 

5. For each input point p located in the pencil of a non-V-extremal point, in O(1) time, either 
locate p in the pencil of a V-extremal point or determine that it is outside C. In the latter case, 
use binary search to find its V-pair. 

6. For each input point p located in the pencil of a V-extremal point, in O(1) time, find a 
segment between V-extremal points above it or find its V-pair. (Details for Steps 5 and 6 in Claim 6.19.) 

7. By now, for every non-V-extremal p ∈ P, we have found a V-pair or proved it is non-extremal 
by providing a V-extremal segment above it. For every pair (e_1, e_2) of adjacent V-extremal points, 
determine the set Q of points that lie above e_1e_2. We then use an output-sensitive convex hull 
algorithm [20] to find the convex hull of Q. Finally, we concatenate the resulting convex hulls to 
obtain conv(P). 
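
Step 7 can be illustrated with the following sketch for the upper hull. It makes several assumptions that are not in the paper: points are in general position (distinct x-coordinates), the list `v_extremal` is sorted from left to right and contains the leftmost and rightmost input points, and each bucket is located by a plain binary search over x-coordinates. The per-bucket hull uses a monotone-chain scan, which is not output-sensitive and merely stands in for the algorithm of [20]. All function names are illustrative.

```python
from bisect import bisect_right


def orient(a, b, c):
    """Positive iff c is left of the directed line a -> b (above it when a is left of b)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def upper_hull(points):
    """Upper convex hull, left to right (monotone chain; not output-sensitive)."""
    hull = []
    for pt in sorted(set(points)):
        while len(hull) >= 2 and orient(hull[-2], hull[-1], pt) >= 0:
            hull.pop()
        hull.append(pt)
    return hull


def hull_from_v_extremal(points, v_extremal):
    """Concatenate the hulls of the sets Q, one per pair of adjacent V-extremal points."""
    xs = [e[0] for e in v_extremal]
    buckets = [[] for _ in range(len(v_extremal) - 1)]
    for p in points:
        i = bisect_right(xs, p[0]) - 1           # index of the V-pair containing x(p)
        if 0 <= i < len(buckets) and orient(v_extremal[i], v_extremal[i + 1], p) > 0:
            buckets[i].append(p)                 # p lies strictly above e_i e_{i+1}
    hull = [v_extremal[0]]
    for (e1, e2), q in zip(zip(v_extremal, v_extremal[1:]), buckets):
        chain = upper_hull([e1, e2] + q)         # starts at e1 and ends at e2
        hull.extend(chain[1:])                   # skip e1, already in the output
    return hull
```

Points that lie below the segment between their two adjacent V-extremal points are discarded before any hull computation, which is what makes the bucket sizes X_i small in the analysis.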

Claim 6.16. Once the location algorithm has finished, for each canonical direction v ∈ V, the 
extremal candidate e_v is the actual extremal point in direction v. 

Proof. By Claim 6.14, e_v does not lie in the pencil of any other point p ∈ P. Hence, the location 
algorithm must have classified e_v as the extremal candidate for v at some point, and this choice 
does not change in the further execution of the algorithm. □ 

Claim 6.17. The total running time for Steps 1, 2, 3 and all binary searches in Step 5 is O(n). 

Proof. There are k = n/log² n V-extremal points, so computing their convex hull takes O(n) time. 
We simultaneously traverse C and the convex hull of the V-extremal points to determine V-pairs 
for all vertices of C. We assumed that there are O(n/log n) points outside C, so the total time for 
binary searches is O(n). □ 

Figure 15: Left: In Step 4, the region below seg(C, C_p) is either below the middle segment of Z or 
between one of the two V-pairs. Right: In Step 6, pen(q) can be partitioned into regions below 
qr_1', below qr_2, between r_1, r_1', or between r_2, r_2'. 

Claim 6.18. Suppose input point p lies below a segment seg(C, C_p). Using the information gathered 
before Step 4, we can either find its V-pair or a segment between V-extremal points above it in O(1) 
time. 

Proof. Let a and b be the endpoints of seg(C, C_p). Consider the convex hull Z of the at most four 
V-extremal points that define the V-pairs of a and b; see Figure 15, left. The hull Z has at most 
three edges, and only the middle one (if it exists) might not be between two adjacent V-extremal 
points. If the middle edge of Z exists, it must lie strictly above seg(C, C_p). This is because the 
endpoints of the middle edge have x-coordinates between x(a) and x(b) and lie outside of C (since 
they are V-extremal), while seg(C, C_p) is inside C. Now we compare p with the convex hull Z. This 
either finds a V-pair for p (if p lies in the interval corresponding to the leftmost or rightmost edge 
of Z) or shows that p lies below a segment between two V-extremal points (if it lies in the interval 
corresponding to the middle edge of Z). □ 

Claim 6.19. Suppose input point p is contained in pen(q), where q is above C, and the construction 
algorithm has completed Step 4. If q is non-V-extremal, in O(1) time, we can either find a 
V-extremal point q' such that p ∈ pen(q'), or determine that p is above C. If q is V-extremal, then in 
O(1) time we can find a V-segment above p or find the V-pair for p. 

Proof. Suppose q is not V-extremal. Since q lies outside of C, we already know the V-pair (e_1, e_2) 
for q. By Claim 6.15, if p is below C, then it is either in pen(e_1) or pen(e_2). We can determine 
which (if at all) in O(1) time. 

Suppose q is V-extremal. Let a and b be the points of C on the boundary of pen(q), where a 
is to the left; see Figure 15, right. Let (r_1, r_1') be a's V-pair, where r_1 is to the left. Similarly, 
(r_2, r_2') is b's V-pair. The segments qr_1' and qr_2 are above pen(q). Furthermore, the pencil slab of 
q is between r_1 and r_2'. One of the following must be true for any point in pen(q): it is below qr_1', 
below qr_2, between (r_1, r_1'), or between (r_2, r_2'). This can be determined in O(1) time. □ 

We are now armed with all the facts to bound the running time. 

Lemma 6.20. With the information from the location algorithm, conv(P) can be computed in 
expected time O(n log log n). 

Proof. By Claims 6.17, 6.18, and 6.19, the total running time of the first six steps is O(n). Let the 
V-extremal points be ordered e_1, e_2, .... Let X_i denote the number of points in uss(e_i, e_{i+1}) and Y_i 
be the number of extremal points in this set. Since we use output-sensitive algorithms to compute 
the hulls of these sets, the running time of Step 7 is O(∑_{i≤k} X_i log(Y_i + 1)). By Lemma 6.1, this 
is O(n log log n), as desired. □ 



7 Restricted searches 

In this section, we prove Lemma 3.7. 

Proof of Lemma 3.7. We bound the expected number of visited nodes in an S-restricted search. 
Let v be a node of T. In the following, we will use q_v and s_v as a shorthand for the values q_{S_v} and 
s_{S_v}. Let vis(v) be the expected number of nodes visited below v, conditioned on v being visited. 
We will prove below, by induction on the height of v, that for all visited nodes v with q_v ≤ 1/2, 

\[ \mathrm{vis}(v) \le c_1 + c \log(q_v/s_v), \tag{1} \]

for some constants c, c_1 > 0. 

Given (1), the lemma follows easily: since T is μ-reducing, it follows that for v at depth k, we 
have q_v ≤ μ^k. Hence, we have q_v ≤ 1/2 for all but the root and at most 1/log(1/μ) nodes below 
it (note that at each level of T there can be at most one node with q_v > 1/2). Let W be the set 
of nodes w of T such that q_w ≤ 1/2, but q_{w'} > 1/2, for the parent w' of w. Since T has bounded 
degree, |W| = O(1/log(1/μ)). The expected number vis(T) of nodes visited in an S-restricted 
search is at most 

\begin{align*}
\mathrm{vis}(T) &\le 1/\log(1/\mu) + \sum_{w \in W} \Pr_{\mathcal{F}_S}[j \in S_w]\, \mathrm{vis}(w) \\
 &\le 1/\log(1/\mu) + c_1 + c \sum_{w \in W} \Pr_{\mathcal{F}_S}[j \in S_w] \log(q_w/s_w) \\
 &\le 1/\log(1/\mu) + c_1 + c \sum_{w \in W} \Pr_{\mathcal{F}_S}[j \in S_w] \log(1/s_w),
\end{align*}

using (1) and q_w ≤ 1. By definition of ℱ_S, we have Pr_{ℱ_S}[j ∈ S_w] = s_w/s_S, so 

\begin{align*}
\mathrm{vis}(T) &\le 1/\log(1/\mu) + c_1 + c \sum_{w \in W} \frac{s_w}{s_S} \log(1/s_w) \\
 &= 1/\log(1/\mu) + c_1 + c \sum_{w \in W} \frac{s_w}{s_S} \bigl(\log(s_S/s_w) - \log s_S\bigr).
\end{align*}

The sum ∑_{w∈W} (s_w/s_S) log(s_S/s_w) represents the entropy of a distribution over W. Hence, it is 
bounded by log |W|. Furthermore, ∑_{w∈W} s_w ≤ s_S, so 

\[ \mathrm{vis}(T) \le 1/\log(1/\mu) + c_1 + c \log |W| - c \log s_S = O(1 - \log s_S), \]

as desired. 

It remains to prove (1). For this, we examine all possible paths down T that an S-restricted 
search can lead to. It will be helpful to consider the possible ways that S can intersect the intervals 
corresponding to the nodes that are visited in a search. Say that the intersection S ∩ S_v of S with 
interval S_v is trivial if it is either empty, S, or S_v. Say that it is anchored if it shares at least one 
boundary line with S. Suppose S ∩ S_v = S_v. Then the search will terminate at v, since we have 
certified that j ∈ S. Suppose S ∩ S_v = S, so S is contained in S_v. There can be at most one child 
of v that contains S. If such a child exists, then the search will simply continue to this child. If 
not, then all possible children (to which the search can proceed) are anchored. The search can 
possibly continue to any child, at most two of which are internal nodes. Suppose S_v is anchored. 
Then at most one child of v can be anchored with S. Any other child that intersects S must be 
contained in it. Refer to Figure 16. 

Figure 16: (α) The intersections S ∩ S_v in (i)-(iii) are trivial, the intersections in (iii) and (iv) are 
anchored; (β) every node of T has at most one non-trivial child, except for r. 
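
The termination rule of an S-restricted search (stop as soon as the interval of the current node is contained in S) can be sketched as follows. This is an illustration only: it assumes a balanced binary tree over leaf intervals 0..k-1 in place of the μ-reducing tree T of the lemma, it represents S as a half-open range of leaves, and the function name `restricted_search` is hypothetical.

```python
def restricted_search(j, k, S):
    """Search for leaf j, stopping once the current interval is contained in S.

    j : index of the leaf being searched for, with a <= j < b
    k : number of leaves, numbered 0..k-1
    S : pair (a, b) denoting the leaves a, a+1, ..., b-1
    Returns (final interval, number of nodes visited including the root).
    """
    a, b = S
    assert a <= j < b, "an S-restricted search is only defined for j in S"
    lo, hi = 0, k
    visited = 1
    # Continue while the current interval S_v is not contained in S.
    while not (a <= lo and hi <= b) and hi - lo > 1:
        mid = (lo + hi) // 2
        lo, hi = (lo, mid) if j < mid else (mid, hi)
        visited += 1
    return (lo, hi), visited
```

With k = 16 and S = (4, 8), the search for j = 5 stops after visiting three nodes, at the interval (4, 8); with S = (5, 6) the same search is a complete search down to the leaf. The smaller S is, the longer the search, which matches the 1 − log s_S form of the bound.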

Consider the set of all possible nodes that can be visited by an S-restricted search (remove all 
nodes that are terminal, i.e., completely contained in S). These form a set of paths that form some 
subtree of T. In this subtree, there is only one possible node that has two children. This comes 
from some node r that contains S and has two anchored (non-leaf) children. Every other node of 
this subtree has a single child. Again, refer to Figure 16. We now prove two claims. 

Claim 7.1. Let $v \ne r$ be a non-terminal node that can be visited by an $S$-restricted search, and let
$w$ be the unique non-terminal child of $v$. Suppose $q_v \le 1/2$ and $\mathrm{vis}(w) \le c_1 + c \log(q_w/s_w)$. Then,
for $c \ge c_1/\log(1/\mu)$, we have

$\mathrm{vis}(v) \le 1 + c \log(q_v/s_v).$ (2)

Proof. From the fact that when a search for $j$ shows that it is contained in a node contained in $S$,
the $S$-restricted search is complete, it follows that

$\mathrm{vis}(v) \le 1 + \frac{s_w}{s_v} \mathrm{vis}(w).$ (3)

Using the hypothesis, it follows that

$\mathrm{vis}(v) \le 1 + \frac{s_w}{s_v} (c_1 + c \log(q_w/s_w)).$

Since $q_w \le \mu q_v$, and letting $\beta := s_w/s_v \le 1$, this is

$\le 1 + \beta (c_1 + c \log(q_v/s_w) + c \log \mu)$

$= 1 + \beta c_1 + \beta c \log q_v + \beta c \log(1/s_w) + \beta c \log \mu.$

The function $x \mapsto x \log(1/x)$ is increasing in the range $x \in (0, 1/2)$, so $s_w \log(1/s_w) \le s_v \log(1/s_v)$
for $s_v \le q_v \le 1/2$. Together with $\beta = s_w/s_v \le 1$, this implies

$\mathrm{vis}(v) \le 1 + \beta c_1 + c \log q_v + c \log(1/s_v) + \beta c \log \mu$
$= 1 + c \log(q_v/s_v) + \beta (c_1 + c \log \mu)$
$\le 1 + c \log(q_v/s_v),$

for $c \ge c_1/\log(1/\mu)$. □

Only a slightly weaker statement can be made for the node $r$ having two nontrivial intersections
at child nodes $r_1$ and $r_2$.

Claim 7.2. Let $r$ be as above, and let $r_1, r_2$ be the two non-terminal children of $r$. Suppose that
$\mathrm{vis}(r_i) \le c_1 + c \log(q_{r_i}/s_{r_i})$, for $i = 1, 2$. Then, for $c \ge c_1/\log(1/\mu)$, we have

$\mathrm{vis}(r) \le 1 + c \log(q_r/s_r) + c.$

Proof. Similar to (3), we get

$\mathrm{vis}(r) \le 1 + \frac{s_{r_1}}{s_r} \mathrm{vis}(r_1) + \frac{s_{r_2}}{s_r} \mathrm{vis}(r_2).$

Applying the hypothesis, we conclude

$\mathrm{vis}(r) \le 1 + \sum_{i=1}^{2} \frac{s_{r_i}}{s_r} [c_1 + c \log(q_{r_i}/s_{r_i})].$

Setting $\beta := (s_{r_1} + s_{r_2})/s_r$ and using $q_{r_i} \le \mu q_r$, we get

$\mathrm{vis}(r) \le 1 + \beta c_1 + \beta c \log \mu + \beta c \log q_r + c \sum_{i=1}^{2} (s_{r_i}/s_r) \log(1/s_{r_i}).$

The sum is maximized for $s_{r_1} = s_{r_2} = s_r/2$, so using once again that $\beta \le 1$, it follows that

$\mathrm{vis}(r) \le 1 + \beta c_1 + \beta c \log \mu + \beta c \log q_r + c \log(2/s_r)$

$\le 1 + \beta (c_1 + c \log \mu) + c \log(q_r/s_r) + c \log 2$

$\le 1 + c \log(q_r/s_r) + c,$

for $c \ge c_1/\log(1/\mu)$, as in (2), except for the addition of $c$. □

Now we use Claims 7.1 and 7.2 to prove (1) by induction. This bound clearly holds for leaves.
For the visited nodes below $r$, we may then inductively take $c_1 = 1$ and $c = 1/\log(1/\mu)$, by
Claim 7.1. We can then apply Claim 7.2 for $r$. For the parent $v$ of $r$, we may use Claim 7.1
with $c_1 = 1 + 1/\log(1/\mu)$ and $c \ge c_1/\log(1/\mu)$, getting $\mathrm{vis}(v) \le 1 + c \log(q_v/s_v)$. Now repeated
application of Claim 7.1 (with the given value of $c$) gives that this bound also holds for the ancestors
of $v$, at least up until the $1 + 1/\log(1/\mu)$ top nodes. This finishes the proof of (1), and hence the
lemma. □
8 Construction of slabs and search trees



Learning the vertical slab structure $\mathbf{S}$ is very similar to learning the V-list in Ailon et al. [2,
Lemma 3.2]. We repeat their construction: think of each $\mathcal{D}_i$ as a distribution generating a single
number, the x-coordinate of the sampled point. Take the union of the x-coordinates of the first
$t = \log n$ inputs $P_1, P_2, \ldots, P_t$. Let the sorted list be $-\infty =: x_0, x_1, \ldots, x_{nt}, x_{nt+1} := +\infty$.

Take the $n$ values $x_0, x_t, x_{2t}, \ldots, x_{(n-1)t}$. This is called the V-list, and it defines the boundaries
for $\mathbf{S}$. We repeat the learning phase result of the self-improving sorter [2, Lemma 3.2].
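The V-list construction just described can be sketched in a few lines. This is a minimal illustration, assuming each training input is given as a list of `(x, y)` tuples; the names `build_v_list` and `leaf_slab` are hypothetical helpers and not part of the paper.

```python
import bisect
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def build_v_list(inputs: Sequence[Sequence[Point]]) -> List[float]:
    """Build the V-list from the first t training inputs (each with n points).

    The x-coordinates of all n*t points are sorted as x_1 <= ... <= x_{nt},
    x_0 is set to -infinity, and every t-th value x_0, x_t, ..., x_{(n-1)t} is kept.
    """
    t = len(inputs)
    n = len(inputs[0])
    xs = [-math.inf] + sorted(p[0] for pts in inputs for p in pts)
    return [xs[j * t] for j in range(n)]


def leaf_slab(v_list: Sequence[float], x: float) -> int:
    """Index j of the leaf slab with v_j <= x < v_{j+1} (the last slab is unbounded)."""
    return bisect.bisect_right(v_list, x) - 1
```

With `t = 2` inputs of `n = 3` points whose x-coordinates are 5, 1, 3 and 2, 6, 4, the sketch keeps the boundaries `-inf, 2, 4`.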

Lemma 8.1. For $0 \le j < n$, let $Z_j = \{x_i \mid v_j \le x_i < v_{j+1}\}$ be the elements with predecessor $v_j$.
With probability at least $1 - n^{-2}$ over the construction of the V-list, we have $\mathbf{E}_{\mathcal{D}}[|Z_j|] = O(1)$ and
$\mathbf{E}_{\mathcal{D}}[|Z_j|^2] = O(1)$, for all $0 \le j < n$. □

Lemma 3.8 follows immediately from Lemma 8.1 and the fact that sorting the $k$ inputs $P_1, P_2,
\ldots, P_k$ takes $O(n \log^2 n)$ time. After the leaf slabs have been determined, the search trees $T_i$ can
be found using essentially the same techniques as before [2, Section 3.2]. The idea is to use $n^{\varepsilon} \log n$
rounds to find the first $\varepsilon \log n$ levels of $T_i$, and to use a balanced search tree for searches that need
to proceed to a deeper level. This only costs a factor of $\varepsilon^{-1}$. We restate Lemma 3.9 for convenience.
The proof is almost the same as that in [2, Section 3.2], but we redo the proof for our setting. (We
require the additional property of restricted search optimality.)

Lemma 8.2. Let $\varepsilon > 0$ be a fixed parameter. In $O(n^{\varepsilon})$ rounds and $O(n^{1+\varepsilon})$ time, we can
construct search trees $T_1, T_2, \ldots, T_n$ over $\mathbf{S}$ such that the following holds: (i) the trees can be totally
represented in $O(n^{1+\varepsilon})$ space; (ii) with probability $1 - n^{-3}$ over the construction of the $T_i$'s, every $T_i$ is
$O(1/\varepsilon)$-optimal for restricted searches over $\mathcal{D}_i$.

Proof. Let $\delta > 0$ be some sufficiently small constant and $c$ be sufficiently large. For $k = c \delta^{-2} n^{\varepsilon} \log n$
rounds and each $p_i$, we record the leaf slab of $\mathbf{S}$ that contains it. We break the proof into smaller
claims.

Claim 8.3. Using $k$ inputs, we can compute estimates $\hat{q}(i, S)$ for each index $i$ and slab $S$. The
following guarantee holds (for all $i$ and $S$) with probability at least $1 - n^{-3}$ over the choice of the $k$
inputs. If at least $(c/10e\delta^2) \log n$ instances of $p_i$ fell in $S$, then $\hat{q}(i, S) \in [(1 - \delta) q(i, S), (1 + \delta) q(i, S)]$.²

Proof. For a slab $S$, let $N(i, S)$ be the number of times $p_i$ was in $S$, and let $\hat{q}(i, S) = N(i, S)/k$ be
the empirical probability for this event. Note that $N(i, S)$ is a sum of independent
random variables and that $\mathbf{E}[N(i, S)] = k q(i, S)$.

If $k q(i, S) < (c/10e\delta^2) \log n$, then $(c/5\delta^2) \log n > 2e \mathbf{E}[N(i, S)]$. By a Chernoff bound [15,
Theorem 1.1, Eq. (1.8)], $\Pr[N(i, S) \ge (c/5\delta^2) \log n] \le 2^{-(c/5\delta^2) \log n} \le n^{-6}$. Hence, with probability
at least $1 - n^{-6}$, if $N(i, S) \ge (c/5\delta^2) \log n$, then $\mathbf{E}[N(i, S)] \ge (c/10e\delta^2) \log n$.

Assume $\mathbf{E}[N(i, S)] \ge (c/10e\delta^2) \log n$. Using multiplicative Chernoff bounds [15, Theorem 1.1,
Eq. (1.7)], $\Pr[N(i, S) \notin [(1 - \delta) \mathbf{E}[N(i, S)], (1 + \delta) \mathbf{E}[N(i, S)]]] \le 2 \exp(-\delta^2 \mathbf{E}[N(i, S)]/3) \le n^{-6}$.
The proof is completed by taking a union bound over all $i$ and $S$. □
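The bookkeeping behind these estimates can be sketched as follows. The sketch assumes that each of the $k$ rounds is given as the list of x-coordinates of $p_1, \ldots, p_n$, that leaf slabs are indexed by their left boundary in a sorted list `v_list`, and that a slab is a contiguous range of leaf slabs; the function names are hypothetical.

```python
import bisect
from typing import List, Sequence


def leaf_counts(rounds: Sequence[Sequence[float]], v_list: Sequence[float]) -> List[List[int]]:
    """counts[i][j] = number of rounds in which p_i fell into leaf slab j."""
    n = len(rounds[0])
    counts = [[0] * len(v_list) for _ in range(n)]
    for xs in rounds:
        for i, x in enumerate(xs):
            # Leaf slab j satisfies v_list[j] <= x < v_list[j + 1].
            counts[i][bisect.bisect_right(v_list, x) - 1] += 1
    return counts


def q_hat(counts_i: Sequence[int], k: int, a: int, b: int) -> float:
    """Empirical probability N(i, S)/k for the slab S made of leaf slabs a, ..., b-1."""
    return sum(counts_i[a:b]) / k
```

In a full implementation one would probably store prefix sums of each `counts[i]` so that every `q_hat` query takes constant time; the plain sum is kept here for readability.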

² We remind the reader that this is the probability that $p_i \in S$.

Henceforth, assume that the high-probability event of this claim holds. Note that if $(c/10e\delta^2) \log n$
inputs fell in $S$, then $\hat{q}(i, S) = \Omega(n^{-\varepsilon})$ and $q(i, S) = \Omega(n^{-\varepsilon})$. The tree $T_i$ is constructed recursively.
We will first create a partial search tree, where some searches may end in non-leaf slabs (or, in
other words, leaves of the tree may not be leaf slabs). The root is just the largest slab. Given
a slab $S$, we describe the creation of the sub-tree of $T_i$ rooted at $S$. If $N(i, S) < (c/10e\delta^2) \log n$, then
we make $S$ a leaf. Otherwise, we pick a leaf slab $\lambda$ such that for the slab $S_l$ consisting of all leaf
slabs (strictly) to the left of $\lambda$ and the slab $S_r$ consisting of all leaf slabs (strictly) to the right of
$\lambda$ we have $\hat{q}(i, S_l) \le (2/3) \hat{q}(i, S)$ and $\hat{q}(i, S_r) \le (2/3) \hat{q}(i, S)$. We make $\lambda$ a leaf child of $S$. Then we
recursively create trees for $S_l$ and $S_r$ and attach them as children to $S$. For any internal node $S$ of
the tree, we have $q(i, S) = \Omega(n^{-\varepsilon})$, and hence the depth is at most $O(\varepsilon \log n)$. Furthermore, this
partial tree is $\beta$-reducing (for some constant $\beta$). The partial tree $T_i$ is extended to a complete tree
in a simple way. From each $T_i$-leaf that is not a leaf slab, we perform a basic binary search for the
leaf slab. This yields a tree $T_i$ of depth at most $(1 + O(\varepsilon)) \log n$. Note that we only need to store
the partial $T_i$ tree, and hence the total space is $O(n^{1+\varepsilon})$.
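The recursive construction of the partial tree can be sketched as below. The sketch assumes that the empirical counts of $p_i$ per leaf slab are available in a list, that a slab is a contiguous index range, and that the splitting leaf slab is chosen as a weighted median (one valid way to meet the 2/3 condition, not necessarily the choice made in the paper). The `Node` class and `build_partial_tree` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    a: int                          # the slab covers leaf slabs a, a+1, ..., b-1
    b: int
    split: Optional[int] = None     # index of the chosen leaf slab, None for a leaf
    left: Optional["Node"] = None   # subtree for the leaf slabs strictly left of split
    right: Optional["Node"] = None  # subtree for the leaf slabs strictly right of split


def build_partial_tree(counts_i: Sequence[int], a: int, b: int, threshold: float) -> Node:
    """Build the partial search tree for one index i from its per-leaf-slab counts."""
    total = sum(counts_i[a:b])
    node = Node(a, b)
    if b - a <= 1 or total < threshold:
        return node  # too few samples (or a single leaf slab): stop here
    running = 0
    for m in range(a, b):
        running += counts_i[m]
        if 2 * running >= total:
            break  # weighted median: both sides now hold at most half of the mass
    node.split = m
    if m > a:
        node.left = build_partial_tree(counts_i, a, m, threshold)
    if m + 1 < b:
        node.right = build_partial_tree(counts_i, m + 1, b, threshold)
    return node
```

Because each side of a weighted median carries at most half of the counted mass, the 2/3 condition on the empirical probabilities is satisfied with room to spare.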

Let us construct, as a thought experiment, a related tree $T'_i$. Start with the partial $T_i$. For every
leaf that is not a leaf slab, extend it downward using the true probabilities $q(i, S)$. In other words,
let us construct the subtree rooted at a new node $S$ in the following manner. We pick a leaf slab $\lambda$
such that $q(i, S_l) \le (2/3) q(i, S)$ and $q(i, S_r) \le (2/3) q(i, S)$ (where $S_l$ and $S_r$ are as defined above).
This ensures that $T'_i$ is $\beta$-reducing. By Lemma 3.7, $T'_i$ is $O(1)$-optimal for restricted searches over
$\mathcal{D}_i$ (we absorb the $\beta$ into $O(1)$ for convenience).

Claim 8.4. The tree $T_i$ is $O(1/\varepsilon)$-optimal for restricted searches.

Proof. Fix a slab $S$ and an $S$-restricted distribution $\mathcal{D}_S$. Let $q'(i, \lambda)$ (for each leaf slab $\lambda$) be
the series of values defining $\mathcal{D}_S$. Note that $q'(i, S) \le q(i, S)$. Suppose $q'(i, S) \le n^{-\varepsilon/2}$. Then
$-\log q'(i, S) \ge \varepsilon (\log n)/2$. Since any search in $T_i$ takes at most $(1 + O(\varepsilon)) \log n$ steps, the search
time is at most $O(\varepsilon^{-1} (-\log q'(i, S) + 1))$.

Suppose $q'(i, S) > n^{-\varepsilon/2}$. Consider a single search for some $p_i$. We will classify this search
based on the leaf of the partial tree that is encountered. By the construction of $T_i$, any leaf $S'$ is
either a leaf slab or has the property that $q(i, S') = O(n^{-\varepsilon})$. The search is of Type 1 if the leaf of
the partial tree actually represents a leaf slab (and hence the search terminates). The search is of
Type 2 (resp. Type 3) if the leaf of the partial tree is a slab that is an internal node of $T_i$ and the
depth is at least (resp. less than) $\varepsilon (\log n)/3$.

When the search is of Type 1, it is identical in both $T_i$ and $T'_i$. When the search is of Type 2, it
takes at least $\varepsilon (\log n)/3$ steps in $T'_i$ and at most (trivially) $(1 + O(\varepsilon)) \log n$ in $T_i$. Consider Type 3
searches. The total number of leaves (that are not leaf slabs) of the partial tree at depth less than
$\varepsilon (\log n)/3$ is at most $n^{\varepsilon/3}$. The total probability mass of $\mathcal{D}_i$ inside such leaves is $O(n^{\varepsilon/3} \times n^{-\varepsilon}) \le
O(n^{-2\varepsilon/3})$. Since $q'(i, S) > n^{-\varepsilon/2}$, in the restricted distribution $\mathcal{D}_S$, the probability of a Type 3
search is at most $O(n^{-\varepsilon/6})$.

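Choose a random $p \sim \mathcal{D}_S$. Let $\mathcal{E}$ denote the event that a Type 3 search occurs. Furthermore,
let $X_p$ denote the depth of the search in $T_i$ and $X'_p$ denote the depth in $T'_i$. When $\mathcal{E}$ does not occur,
we have argued that $X_p = O(X'_p/\varepsilon)$. Also, $\Pr(\mathcal{E}) = O(n^{-\varepsilon/6})$. The expected search time is just
$\mathbf{E}[X_p]$. By Bayes' rule,

$\mathbf{E}[X_p] = \Pr(\overline{\mathcal{E}}) \mathbf{E}_{\overline{\mathcal{E}}}[X_p] + \Pr(\mathcal{E}) \mathbf{E}_{\mathcal{E}}[X_p] \le O(\varepsilon^{-1} \mathbf{E}_{\overline{\mathcal{E}}}[X'_p]) + n^{-\varepsilon/6} (1 + O(\varepsilon)) \log n$

$= O(\varepsilon^{-1} \mathbf{E}_{\overline{\mathcal{E}}}[X'_p] + 1).$

Since $\Pr(\overline{\mathcal{E}}) \ge 1/2$, $\mathbf{E}_{\overline{\mathcal{E}}}[X'_p] \le 2 \Pr(\overline{\mathcal{E}}) \mathbf{E}_{\overline{\mathcal{E}}}[X'_p] \le 2 \mathbf{E}[X'_p]$. Combining all the arguments, the expected
search time is $O(\varepsilon^{-1} (\mathbf{E}[X'_p] + 1))$. Since $T'_i$ is $O(1)$-optimal for restricted searches, $T_i$ is $O(\varepsilon^{-1})$-optimal. □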
9 Learning phase for convex hulls

Here we provide the deferred proofs for the lemmas stated in Section 6.1. We begin with some 
preliminaries about projective duality and an important probabilistic claim about geometric con- 
structions over product distributions. 

Consider an input $P$. As is well known, there is a dual set $P^*$ of lines that helps us understand
the properties of $P$. In particular, we use the standard duality along the unit paraboloid that maps
a point $p = (x(p), y(p))$ to the line $p^* : y = 2x(p)x - y(p)$ and vice versa. The lower envelope of
$P^*$ is the pointwise minimum of the $n$ lines $p^*_1, p^*_2, \ldots, p^*_n$, considered as univariate functions. We
denote it by $\mathrm{lev}_0(P^*)$. It is a classic fact that there is a one-to-one correspondence between the
vertices and edges of $\mathrm{lev}_0(P^*)$ and the edges and vertices of $\mathrm{conv}(P)$. More generally, for $z = 0, \ldots, n$,
the $z$-level of $P^*$ is the closure of the set of all points that lie on lines of $P^*$ and that have exactly
$z$ lines of $P^*$ strictly below them. The $z$-level is an x-monotone polygonal curve, and we denote it
by $\mathrm{lev}_z(P^*)$. (Refer to Figure 17.) Finally, the $(\le z)$-level of $P^*$, $\mathrm{lev}_{\le z}(P^*)$, is the set of all points
on lines in $P^*$ that are on or below $\mathrm{lev}_z(P^*)$.
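The duality map and the notion of a level can be illustrated with a short sketch. It assumes that lines are stored as `(slope, intercept)` pairs and that no two lines have the same height at the queried abscissa; `dual_line` and `level_height` are hypothetical helper names.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Line = Tuple[float, float]  # (slope, intercept), i.e. y = slope * x + intercept


def dual_line(p: Point) -> Line:
    """Dual of the point p = (x(p), y(p)): the line y = 2 x(p) x - y(p)."""
    return (2 * p[0], -p[1])


def level_height(lines: Sequence[Line], z: int, x: float) -> float:
    """Height of the z-level at abscissa x: the line with exactly z lines strictly below it.

    Assumes the lines have pairwise distinct heights at x.
    """
    heights: List[float] = sorted(m * x + c for m, c in lines)
    return heights[z]
```

Here `level_height(lines, 0, x)` evaluates the lower envelope at `x`, since the 0-level is the pointwise minimum of the lines.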

Consider the following abstract procedure. Fix some absolute constant $b$. For any set $S$ of $b$
lines, let $\mathrm{reg}(S)$ be some geometric region defined by the lines $\{\ell_i \mid i \in S\}$. In other words, $\mathrm{reg}(\cdot)$
is a function from sets of lines of size $b$ to geometric regions (which is just some set in $\mathbb{R}^2$). For
example, $\mathrm{reg}(\cdot)$ may be a triangle or trapezoid formed by some lines in $S$. Consider some such
region $R$ and a line $\ell$. Let $\chi(\ell, R)$ be some boolean function, taking as input a line and a geometric
region.

Suppose we generate a random instance $Q^* \sim \mathcal{D}$. We apply some procedure to determine
various subsets $S_1, S_2, \ldots$ of $b$ lines from $Q^*$. These are chosen based on the values of the sums
$\sum_{i \le n} \chi(q^*_i, \mathrm{reg}(S_j))$. Now generate another random instance $P^* \sim \mathcal{D}$. What can we say about the
values of $\sum_i \chi(p^*_i, \mathrm{reg}(S_j))$? We expect them to resemble $\sum_i \chi(q^*_i, \mathrm{reg}(S_j))$, but we have to deal
with subtle issues of dependencies. In the former case, $S_j$ actually depends on $Q^*$, while in the
latter case it does not. Nonetheless, we can apply concentration inequalities to make statements
about $\sum_i \chi(p^*_i, \mathrm{reg}(S_j))$.

Let $J$ be a set of $b$ indices in $[n]$, and set $Q^*_J := \{q^*_j \mid j \in J\}$. The following lemma can be seen
as a generalization of Lemma 8.1. The proof is quite analogous to the respective one in [2].

Lemma 9.1. Fix a constant integer $b > 0$. Let $f_l(n), f_u(n)$ be increasing functions such that for
all sufficiently large $n$, $f_u(n) \ge f_l(n) \ge c' b \log n$ (for sufficiently large constant $c'$). The following
holds with probability at least $1 - n^{-4}$ over a random $Q^* \sim \mathcal{D}$. For all index sets $J$ of size $b$, if
$\sum_{i \le n} \chi(q^*_i, \mathrm{reg}(Q^*_J)) \in [f_l(n), f_u(n)]$, then for some absolute constant $\alpha \in (0, 1)$,

$\Pr_{P^* \sim \mathcal{D}} \bigl[ \sum_{i \le n} \chi(p^*_i, \mathrm{reg}(Q^*_J)) \in [\alpha f_l(n), f_u(n)/\alpha] \bigr] \ge 1 - n^{-3}.$

Proof. Fix an index set $J$ of size $b$. Condition on the set of lines $S = \{q^*_j \mid j \in J\}$ being fixed. Note
that the distributions $\mathcal{D}_i$, $i \notin J$, remain the same because of independence. Consider the region
$\mathrm{reg}(S)$, which is simply some fixed region. Generate a random $Q^*$ conditioned on $Q^*_J = S$. (This
means that we just generate random lines $q^*_i$, for $i \notin J$.) Focus on the sum $\sum_i \chi(q^*_i, \mathrm{reg}(S))$.

Note that $|\sum_{i \notin J} \chi(q^*_i, \mathrm{reg}(S)) - \sum_i \chi(q^*_i, \mathrm{reg}(S))| \le b$. Suppose $\sum_i \chi(q^*_i, \mathrm{reg}(S)) \in [f_l(n), f_u(n)]$;
then $\sum_{i \notin J} \chi(q^*_i, \mathrm{reg}(S)) \in [g_l(n), g_u(n)]$ (where $g_l(n) = f_l(n) - b$ and $g_u(n) = f_u(n) + b$). What
can we say about $\sum_{i \notin J} \chi(p^*_i, \mathrm{reg}(S))$, for an independent $P^* \sim \mathcal{D}$? Observe that since $\mathrm{reg}(S)$ is
fixed, $\chi(p^*_i, \mathrm{reg}(S))$ and $\chi(q^*_i, \mathrm{reg}(S))$ are identically distributed.

Define (independent) 0-1 random variables $Z_{J,i} = \chi(p^*_i, \mathrm{reg}(S))$, and let $Z_J = \sum_{i \notin J} Z_{J,i}$ and
$\hat{Z}_J = \sum_i Z_{J,i}$. We keep the subscript $J$ because the region $\mathrm{reg}(S)$ uses fixed lines with indices in $J$.
Given that one draw of $Z_J$ is in $[g_l(n), g_u(n)]$, we want to give bounds on another draw. (This is
basically a Bayesian problem, in that we effectively construct a prior over $\mathbf{E}[Z_J]$. Two Chernoff
bounds suffice to perform the argument.)

Claim 9.2. Consider a single draw of $Z_J$ and suppose that $Z_J \in [g_l(n), g_u(n)]$. With probability at
least $1 - n^{-c'b/5}$, $\mathbf{E}[Z_J] \in [g_l(n)/6, 2 g_u(n)]$.

Proof. We use the Chernoff bounds [15, Theorem 1.1]. Suppose $\mu := \mathbf{E}[Z_J] < g_l(n)/6$. Then $g_l(n) >
2e\mu$. Hence, $\Pr[Z_J \ge g_l(n)] \le 2^{-g_l(n)} \le n^{-c'b/2}$ (noting that $g_l(n) = f_l(n) - b \ge (c'b/2) \log_2 n$).
With probability at least $1 - n^{-c'b/2}$, if $Z_J \ge g_l(n)$, then $\mathbf{E}[Z_J] \ge g_l(n)/6$.

We repeat the argument with a lower tail Chernoff bound. Suppose $\mu > 2 g_u(n)$. Then $\Pr[Z_J \le
g_u(n)] \le \Pr[Z_J \le (1 - 1/2)\mu] \le e^{-g_u(n)/4} \le n^{-c'b/4}$. With probability at least $1 - n^{-c'b/4}$, if
$Z_J \le g_u(n)$, then $\mathbf{E}[Z_J] \le 2 g_u(n)$. A union bound completes the proof. □

For this claim, we conditioned on the set of lines $S = \{q^*_j \mid j \in J\}$ being fixed. But the claim
holds regardless of what the conditioning is, and hence is true unconditionally. Therefore, for a fixed
$J$, with probability at least $1 - n^{-c'b/5}$ over $Q^* \sim \mathcal{D}$, if $Z_J \in [g_l(n), g_u(n)]$, then $\mathbf{E}[Z_J] \in [g_l(n)/6, 2 g_u(n)]$.
Given that $|Z_J - \hat{Z}_J| \le b$, this implies: if $\hat{Z}_J \in [f_l(n), f_u(n)]$, then $\mathbf{E}[\hat{Z}_J] \in [f_l(n)/7, 3 f_u(n)]$.

There are $O(n^b)$ possible index sets $J$, so by a union bound the above holds with probability at
least $1 - n^{-c'b/6}$ for all $J$ simultaneously. Suppose we chose a $Q$ such that the condition above held
(so $\forall J$, $\mathbf{E}[\hat{Z}_J] \in [f_l(n)/7, 3 f_u(n)]$). Consider drawing $P \sim \mathcal{D}$, so effectively an independent draw of
$\hat{Z}_J$. Applying upper tail and lower tail Chernoff bounds again, for some sufficiently small constants
$\alpha, \rho$, $\Pr[\hat{Z}_J \in [\alpha f_l(n), f_u(n)/\alpha]] \ge 1 - \exp(-\rho f_l(n)) \ge 1 - n^{-3}$. □



9.1 Learning the canonical directions 

We now describe how to obtain the canonical directions promised in Lemma 6.1. For this, we take
a random input $Q^* \sim \mathcal{D}$.

Take the $(\log^4 n)$-level of $Q^*$, and let $H'$ be the upper convex hull of its vertices. Refer to Figure 17.

Claim 9.3. The hull H' has the following properties: 

1. The curve $H'$ lies below $\mathrm{lev}_{2 \log^4 n}(P^*)$.

2. Each line of Q* either supports an edge of H' or intersects it at most twice. 

3. H' has 0(n) vertices. 

Proof. Consider any (internal) point $p$ on $H'$, and let $a, b$ be the vertices of $H'$ such that $p$ is
between them in terms of x-coordinate. Any line that goes under $p$ must go under $a$ or $b$. There
are exactly $\log^4 n$ lines under $a$ (and $b$), since this is on the $(\log^4 n)$-level. Hence, there are at most
$2 \log^4 n$ lines below $p$.

The second property is a direct consequence of convexity. The third property follows from the
second. Every vertex of $H'$ is present on some line of $Q^*$, and hence there can be at most $2n$
vertices. □
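For readers who want to experiment with the hull $H'$, the upper convex hull of a set of level vertices can be computed with the standard monotone-chain scan. This is a generic textbook routine, shown here as a sketch; it is not the level-computation algorithm of [13,14], and it assumes the input points have pairwise distinct x-coordinates.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Cross product (a - o) x (b - o); negative means a clockwise turn o -> a -> b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def upper_hull(points: Sequence[Point]) -> List[Point]:
    """Upper convex hull, listed from left to right (monotone-chain scan)."""
    hull: List[Point] = []
    for p in sorted(set(points)):
        # Drop the last hull point while it lies on or below the segment hull[-2] -> p.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull
```

After sorting, the scan takes linear time, because each point is pushed and popped at most once.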
Figure 17: The arrangement of $Q^*$: for the sake of demonstration, the dark black line is the 4-level
of the arrangement. The thick lighter line is $H'$, the convex hull of the vertices in the level. The
shaded region is a possible trapezoid $\tau_j$.

Let $r_0, r_1, \ldots, r_k$ be the points given by every $(\log^2 n)$-th point in which a line in $Q^*$ meets
$H'$ (either as an intersection point or as an endpoint of a segment). For convenience, we will
think of these as ordered from right to left. By Claim 9.3(2), there are $O(n/\log^2 n)$ such points
$r_j$. Let $H$ be the upper convex hull of these points $r_j$. Clearly $H$ lies below $H'$. Draw a vertical
downward ray through each vertex $r_j$. This subdivides the region below $H$ into semi-unbounded
trapezoids $\tau_0, \tau_1, \ldots$ with the following properties: (i) each vertical boundary ray of a trapezoid $\tau_j$
is intersected by at least $\log^4 n$ and at most $2 \log^4 n$ lines of $Q^*$ (by Claim 9.3(1)); and (ii) the upper
boundary segment of each $\tau_j$ is intersected by at most $\log^2 n$ lines in $Q^*$ (by construction). The
shaded region in Figure 17 shows one such trapezoid. The next claim follows from an application
of Lemma 9.1.

Claim 9.4. With probability at least $1 - n^{-4}$ (over $Q$), the following holds for all trapezoids $\tau_j$:
generate an independent $P^* \sim \mathcal{D}$.

1. With probability (over $P^*$) at least $1 - n^{-3}$, there exists a line in $P^*$ that intersects both
boundary rays of $\tau_j$;

2. with probability (over $P^*$) at least $1 - n^{-3}$, at most $\log^5 n$ lines of $P^*$ intersect $\tau_j$.

Proof. We apply Lemma 9.1 for both parts. For a set $S = \{\ell_1, \ell_2, \ell_3, \ell_4\}$ of four lines, define $\mathrm{reg}(S)$
as the downward unbounded vertical trapezoid formed by the segment between the intersection points
of $\ell_1, \ell_2$ and $\ell_3, \ell_4$. Note that all trapezoids $\tau_j$ are of this form, where $S$ is a set of four lines from $Q^*$.
Set $\chi(\ell, \tau)$ (for line $\ell$ and trapezoid $\tau$) to 1 if $\ell$ intersects both parallel sides of $\tau$ and 0 otherwise.

Since $\tau_j$ is an unbounded trapezoid, a line that intersects it either intersects the upper segment
or intersects both boundary rays. In our sample $Q$, the number of lines satisfying the former is at
most $\log^2 n$ and the number satisfying the latter is in $[\log^4 n, 4 \log^4 n]$. Hence, the sum $\sum_{i=1}^{n} \chi(q^*_i, \tau_j)$
is at least $\log^4 n - \log^2 n \ge (1/2) \log^4 n$ and at most $5 \log^4 n$. By Lemma 9.1, the number of lines
in $P^*$ intersecting both vertical lines is $\Omega(\log^4 n)$ with probability at least $1 - n^{-3}$.

Now for the second part. We redefine $\chi(\ell, \tau)$ to be 1 if $\ell$ intersects $\tau$ and 0 otherwise. Any line
that intersects $\tau_j$ must intersect one of the vertical boundaries, so $\sum_{i=1}^{n} \chi(q^*_i, \tau_j) \in [\log^4 n, 4 \log^4 n]$.
By Lemma 9.1, the number of lines in $P^*$ intersecting $\tau_j$ is $O(\log^4 n)$ with probability $\ge 1 - n^{-3}$. □
Proof of Lemma 6.1. Each point $r_j$ is dual to a line $r^*_j$. We define the directions in $V$ by taking
the upward unit normal to the lines $r^*_j$. (Since the $r_j$'s are ordered from right to left, this gives $V$ in
clockwise order.) It is clear that these directions can be found in $O(n \, \mathrm{polylog}(n))$ time: we can
compute $\mathrm{lev}_{\log^4 n}(P^*)$ and its convex hull $H'$ in $O(n \, \mathrm{polylog}\, n)$ time [13,14]. To determine the
points $r_j$, we perform $O(n)$ binary searches over $H'$, and then sort the intersection points. When
$r_j$ is known, $v_j$ can be found in constant time.

Now consider a random $P$ and its dual. The $V$-extremal vertices $e_j$ and $e_{j+1}$ correspond to the
lowest line in $P^*$ that intersects the left and the right boundary ray of $\tau_j$. The number of extremal
points between $e_j$ and $e_{j+1}$ is the number of vertices on the lower envelope of $P^*$ between $x(r_j)$
and $x(r_{j+1})$. By Claim 9.4(1), this lower envelope lies entirely inside $\tau_j$ with probability at least
$1 - n^{-3}$. By Claim 9.4(2) (and a union bound), the number $Y_j$ of extremal points between $e_j$ and
$e_{j+1}$ is at most $\log^5 n$ with probability at least $1 - 2n^{-3}$. Thus, we have
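The constant-time step from a point $r_j$ to its direction can be sketched as follows. The sketch assumes that the dual of $r_j = (a, b)$ is the line $y = 2ax - b$, as in the duality map stated at the start of this section, and that the upward unit normal is the normal vector with positive y-component; `canonical_direction` is a hypothetical name.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def canonical_direction(r: Point) -> Tuple[float, float]:
    """Upward unit normal of the dual line y = 2 x(r) x - y(r) of the point r.

    The line 2a x - y - b = 0 has normal vectors (2a, -1) and (-2a, 1);
    the second one points upward and is normalised to unit length.
    """
    a = r[0]
    norm = math.hypot(2.0 * a, 1.0)
    return (-2.0 * a / norm, 1.0 / norm)
```

Under these assumptions, decreasing x-coordinates of the points give normals that rotate from left-pointing to right-pointing, which matches the clockwise order mentioned in the proof.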

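$\mathbf{E}[X_i \log(Y_i + 1)]$

$\le \mathbf{E}[X_i \log(\log^5 n + 1) \mid Y_i \le \log^5 n] \Pr[Y_i \le \log^5 n] + \mathbf{E}[X_i \log(Y_i + 1) \mid Y_i > \log^5 n] \Pr[Y_i > \log^5 n]$

$\le \mathbf{E}[X_i \mid Y_i \le \log^5 n] \Pr[Y_i \le \log^5 n] \, O(\log \log n) + O(n^2) (2 n^{-3})$

$\le \mathbf{E}[X_i] \, O(\log \log n) + O(1).$

Adding over $i$,

$\sum_{i=1}^{k} \mathbf{E}[X_i \log(Y_i + 1)] \le \sum_{i=1}^{k} \bigl( \mathbf{E}[X_i] \, O(\log \log n) + O(1) \bigr) = \mathbf{E}\bigl[\sum_{i=1}^{k} X_i\bigr] O(\log \log n) + O(n)$

$= O(n \log \log n).$

□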

9.2 Learning the lines 

To compute the canonical lines $\ell_j$ for the canonical directions $v_j \in V$, we consider again the dual
sample $Q^*$ from the previous section. The $j$th line $\ell_j$ is defined as the line that is dual to the point
on $\mathrm{lev}_{\gamma c \log n}(Q^*)$ that has the same x-coordinate as $r_j$. Here, $\gamma > 0$ is a sufficiently small constant.
Note that $\ell_j$ is normal to $v_j$. This can be constructed in $O(n \, \mathrm{poly}(\log n))$ time. We restate the
main technical part of Lemma 6.2.

Lemma 9.5. With probability at least $1 - n^{-4}$ over the construction of $\ell_1, \ell_2, \ldots$, for every $\ell_j$,

$\Pr_{P \sim \mathcal{D}} [\, |\ell_j^+ \cap P| \in [1, c \log n] \,] \ge 1 - n^{-3}.$

Proof. For convenience, we will go back to dual space. Let $s_j$ be the point that corresponds to $\ell_j$.
A point $p \in \ell_j^+$ iff $p^*$ intersects the downward vertical ray from $s_j$ (call this $R_j$).

We set up an application of Lemma 9.1. For a pair of lines $\ell_1, \ell_2$ (all in dual space), define
$\mathrm{reg}(\ell_1, \ell_2)$ to be the downward vertical ray from the intersection of $\ell_1$ and $\ell_2$. Note that any $s_j$ is
formed by the intersection of two lines from $Q^*$. For such a region $R$ and line $\ell'$, set $\chi(\ell', R)$ to be
1 if $\ell'$ intersects $R$ and 0 otherwise. Note that, by construction, $\sum_i \chi(q^*_i, R_j) = \gamma c \log n$. We apply
Lemma 9.1. With probability at least $1 - n^{-4}$ over $Q^*$ (for sufficiently large $c$ and small enough
$\gamma$), $\Pr_{P \sim \mathcal{D}} [\sum_i \chi(p^*_i, R_j) \in [1, c \log n]] \ge 1 - n^{-3}$. □
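The indicator used in this proof is simple to state in code. The sketch below assumes dual lines are stored as `(slope, intercept)` pairs and that a downward vertical ray is represented by its apex; a line meets the ray exactly when it passes on or below the apex at the apex's x-coordinate. The names `hits_downward_ray` and `count_hits` are hypothetical.

```python
from typing import Sequence, Tuple

Line = Tuple[float, float]   # (slope, intercept), i.e. y = slope * x + intercept
Point = Tuple[float, float]


def hits_downward_ray(line: Line, apex: Point) -> bool:
    """True iff the line meets the downward vertical ray starting at apex."""
    m, c = line
    return m * apex[0] + c <= apex[1]


def count_hits(lines: Sequence[Line], apex: Point) -> int:
    """Number of lines meeting the downward ray: the sum of the indicator over all lines."""
    return sum(1 for line in lines if hits_downward_ray(line, apex))
```

Counting the hits for the dual lines of a fresh input then gives the quantity that the proof bounds between 1 and $c \log n$.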